From: Jon Harrop
Organization: Flying Frog Consultancy Ltd.
To: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] ocaml doesn't need to optimize on amd64??
Date: Thu, 10 Jan 2008 22:56:42 +0000
Message-Id: <200801102256.42193.jon@ffconsultancy.com>
In-Reply-To: <200801091714.52294.jon@ffconsultancy.com>
References: <200801090922.00231.ober.14@osu.edu> <200801091714.52294.jon@ffconsultancy.com>

On Wednesday 09 January 2008 17:14:52 Jon Harrop wrote:
> On Wednesday 09 January 2008 14:22:00 Kuba Ober wrote:
> > Jon & al,
> >
> > why do you think that OCaml doesn't need to do certain
> > optimizations on amd64?
>
> OCaml does a (much) better job of code generation on AMD64.

I was in a bit of a rush when I wrote that and I'd like to explain what I
meant in more detail.

OCaml's AMD64 code gen is so good that I have never heard of a reason to drop
to C for high-performance code. That is simply no longer an issue, so we are
free to stay in the land of safety and high-level concepts, which is
enormously valuable in practice because it saves so much developer time.

In particular, this is more important than having a compiler that implements
the high-level optimizations we discussed, because such a compiler can never
match the performance of C if its AMD64 code gen is not as good as OCaml's.

This is one of the reasons why I continue to choose OCaml over SML/NJ, MLton,
GHC 6.8, SBCL, Bigloo and almost all other FPL compilers: they have worse
AMD64 code gens and impose a significantly lower speed limit upon their users.

If you look at the ray tracer benchmark:

  http://www.ffconsultancy.com/languages/ray_tracer/results.html

Three of the five OCaml programs are faster than any SML program compiled with
MLton, even though MLton implements lots of high-level optimizations that
OCaml does not. You can draw two crucial conclusions from this:

1. Even though OCaml lacks some high-level optimizations, you don't have to
put much effort in before OCaml beats MLton-compiled SML, because OCaml's
AMD64 code gen is so good. In particular, two of those three OCaml
implementations don't even bother implementing any of the low-level
optimizations that we've discussed at all!

2. OCaml lets you choose how much you want to optimize your code, right up to
the performance of C, but MLton imposes quite a low speed limit: 55% slower
than the front runners. Once you hit that limit with MLton, your only option
is to drop to C (or OCaml ;-). However, dropping into another language imposes
its own performance hits and is even likely to undermine the compiler's
optimization efforts.

So I agree that it would be very nice if OCaml implemented more optimizations
along these lines, but I choose OCaml with its excellent code gen over a more
optimizing compiler that didn't have such a good code gen every day of the
week and twice on Sundays.

AFAIK, these high-level optimizations are never likely to get implemented in
OCaml. My first vote would actually go to a different (and more fundamentally
important) optimization anyway: arbitrary unboxing.

One of the nice things about F# and GHC is that you can specify in your type
declaration that the type is to be stored unboxed. This can have profound
performance implications. Even in standalone code, unboxing arrays of complex
numbers (which OCaml does not do) makes FFTs 5x faster. In the context of
FFIs, the performance improvement can be much bigger because you can
completely avoid the cost of copying huge quantities of data (e.g.
color/texcoord/normal/position struct arrays in vertex buffer objects for
OpenGL). In contrast, the overhead of lambda abstraction in numerical code is
"only" a factor of 2.

PS: Kuba, your C code will most likely run significantly faster in 64-bit as
well.

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e
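
To make the unboxing point in the message concrete, here is a minimal sketch of the two layouts for an array of complex numbers in OCaml. It is an illustration only, not code from the benchmark or from the FFT measurement mentioned above: the function names sum_boxed, of_complex_array and sum_flat are hypothetical, and the interleaved float array is just one hand-written way to get a flat layout, relying on the fact that OCaml stores float arrays unboxed.

```ocaml
(* Boxed layout: a Complex.t array holds one pointer per element, each
   pointing to a separately heap-allocated record of two floats. *)
let sum_boxed (a : Complex.t array) : Complex.t =
  Array.fold_left Complex.add Complex.zero a

(* Hand-unboxed layout (hypothetical helper): real and imaginary parts are
   interleaved in a single float array, which OCaml stores flat. *)
let of_complex_array (a : Complex.t array) : float array =
  let n = Array.length a in
  let flat = Array.make (2 * n) 0.0 in
  Array.iteri
    (fun i z ->
      flat.(2 * i) <- z.Complex.re;
      flat.(2 * i + 1) <- z.Complex.im)
    a;
  flat

(* Same sum over the flat layout: no pointer is followed per element. *)
let sum_flat (flat : float array) : float * float =
  let re = ref 0.0 and im = ref 0.0 in
  for i = 0 to Array.length flat / 2 - 1 do
    re := !re +. flat.(2 * i);
    im := !im +. flat.(2 * i + 1)
  done;
  (!re, !im)
```

The flat version trades the convenience of Complex.t for a contiguous block of floats, which is the layout an unboxed type declaration would give you automatically. The standard Bigarray module with its complex64 kind is another way to get a flat layout, and one that can also be shared with C without copying. How much either helps in a given program is something to measure rather than assume.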