From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: * X-Spam-Status: No, score=1.0 required=5.0 tests=AWL,SPF_NEUTRAL autolearn=disabled version=3.1.3 Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105]) by yquem.inria.fr (Postfix) with ESMTP id AE416BC6B for ; Fri, 11 Jan 2008 01:42:12 +0100 (CET) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgAAAOpJhkdC+VLui2dsb2JhbACQHgEBAQgEBCQFkwuGJQ X-IronPort-AV: E=Sophos;i="4.24,269,1196636400"; d="scan'208";a="21111185" Received: from wx-out-0506.google.com ([66.249.82.238]) by mail4-smtp-sop.national.inria.fr with ESMTP; 11 Jan 2008 01:42:15 +0100 Received: by wx-out-0506.google.com with SMTP id h27so795754wxd.15 for ; Thu, 10 Jan 2008 16:42:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:from:reply-to:to:subject:date:user-agent:cc:references:in-reply-to:content-type:content-transfer-encoding:content-disposition:message-id; bh=MIxsntsFn1WugSuOL+UJW7jZ2GsejtAGIbiQeNmH8hY=; b=IrPhzDJ876WvAvoVXhDRs/tMMQ+cmZcGSWCg7B2gfXLk6+HvC3PHVYZU4XgurUNj59oHJw4DlRIz019xcU10QNDoG1p7TZ412IdgLXWK6TbZY+N4gy0yc4LKHPfUAnDZbpPTQso1QN8HnPd9fIugt4HErKgDHXbuJt4OWWTJnb0= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=from:reply-to:to:subject:date:user-agent:cc:references:in-reply-to:content-type:content-transfer-encoding:content-disposition:message-id; b=tKfplgcoApoVUtIVexRl03EZzOJ3SZDgExxarSj2acN8jZ5XW/WeSlW3Q+CdT0bVLec7ojkKFCHm6CVlzRwHoy8kIvLo2EyReAai6DK651bwWLOT0gEmadocd1ZsRgknEQyIyJ1NzxhXNEFSPDmkgeqfTX1S28mLeWcEXv4uTPs= Received: by 10.70.17.1 with SMTP id 1mr1675741wxq.28.1200012130816; Thu, 10 Jan 2008 16:42:10 -0800 (PST) Received: from tama-chan ( [24.99.180.210]) by mx.google.com with ESMTPS id h14sm4214314wxd.26.2008.01.10.16.42.09 (version=TLSv1/SSLv3 cipher=OTHER); Thu, 10 Jan 2008 16:42:09 -0800 (PST) From: Peng Zang Reply-To: peng.zang@gmail.com To: caml-list@yquem.inria.fr Subject: Re: [Caml-list] ocaml doesn't need to optimize on amd64?? Date: Thu, 10 Jan 2008 19:42:05 -0500 User-Agent: KMail/1.9.7 Cc: Jon Harrop References: <200801090922.00231.ober.14@osu.edu> <200801091714.52294.jon@ffconsultancy.com> <200801102256.42193.jon@ffconsultancy.com> In-Reply-To: <200801102256.42193.jon@ffconsultancy.com> Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200801101942.08123.peng.zang@gmail.com> X-Spam: no; 0.00; ocaml:01 hash:01 ocaml's:01 ocaml:01 ocaml's:01 high-level:01 compiler:01 implements:01 high-level:01 compiler:01 bigloo:01 implements:01 lacks:01 low-level:01 afaik:01 -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hey Jon, I was wondering if you've had any experience with OCaml's speed and quality of code generate on Intel 64 platforms. Eg. a Core 2 Duo or one of the more recent Xeons. Thanks, Peng On Thursday 10 January 2008 05:56:42 pm Jon Harrop wrote: > On Wednesday 09 January 2008 17:14:52 Jon Harrop wrote: > > On Wednesday 09 January 2008 14:22:00 Kuba Ober wrote: > > > Jon & al, > > > > > > why do you think that OCaml doesn't need to do certain > > > optimizations on amd64? > > > > OCaml does a (much) better job of code generation on AMD64. > > I was in a bit of a rush when I wrote that and I'd like to explain what I > meant in more detail. > > OCaml's AMD64 code gen is so good that I have never heard of a reason to > drop to C for high performance code. That is simply no longer an issue, so > we are free to stay in the land of safety and high-level concepts which is > enormously valuable in practice because it saves so much developer time. > > In particular, this is more important than having a compiler that > implements the high-level optimizations we discussed because such a > compiler can never match the performance of C if its AMD64 code gen is not > as good as OCaml's. > > This is one of the reasons why I continue to choose OCaml over SML/NJ, > MLton, GHC6.8, SBCL, Bigloo and almost all other FPL compilers: they have > worse AMD64 code gens and imposes a significantly lower speed limit upon > their users. > > If you look at the ray tracer benchmark: > > http://www.ffconsultancy.com/languages/ray_tracer/results.html > > Three of the five OCaml programs are faster than any SML compiled with > MLton even though MLton implements lots of high-level optimizations that > OCaml does not. You can draw two crucial conclusions from this: > > 1. Even though OCaml lacks some high-level optimizations, you don't have to > put much effort in before OCaml beats MLton-compiled SML because OCaml's > AMD64 code gen is so good. In particular, two of those three OCaml > implementations don't even bother implementing any of the low-level > optimizations that we've discussed at all! > > 2. OCaml lets you choose how much you want to optimize your code right up > to the performance of C but MLton imposes quite a low speed limit: 55% > slower than the front runners. Once you hit that limit with MLton your only > option is to drop to C (or OCaml ;-). However, dropping into another > language imposes its own performance hits and is even likely to undermine > the compiler's optimization efforts. > > So I agree that it would be very nice if OCaml implemented more > optimizations along these lines but I choose OCaml with its excellent code > gen over a more optimizing compiler that didn't have such a good code gen > every day of the week and twice on Sundays. > > AFAIK, these high-level optimizations are never likely to get implemented > in OCaml. My first vote would actually go to a different (and more > fundamentally important) optimization anyway: arbitrary unboxing. One of > the nice things about F# and GHC is that you can specify in your type > declaration that the type is to be stored unboxed. This can have profound > performance > implications. > > Even in standalone code, unboxing arrays of complex numbers (which OCaml > does not do) makes FFTs 5x faster. In the context of FFIs, the performance > improvement can be much bigger because you can completely avoid the cost of > copying huge quantities of data (e.g. color/texcoord/normal/position struct > arrays in vertex buffer objects for OpenGL). In contrast, the overhead of > lambda abstraction in numerical code is "only" a factor of 2. > > PS: Kuba, your C code will most likely run a significantly faster in 64-bit > as well. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.7 (GNU/Linux) iD8DBQFHhrtgfIRcEFL/JewRAqgFAKCm1Px3uMP8nEq5lrIp/vHdbIvSsgCgiDSJ NpRFs4RKR+W5b0S0n9SIkAo= =Y81S -----END PGP SIGNATURE-----