From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.4 required=5.0 tests=AWL,DNS_FROM_RFC_ABUSE autolearn=disabled version=3.1.3 Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105]) by yquem.inria.fr (Postfix) with ESMTP id BDB3ABC6B for ; Mon, 7 Jan 2008 18:31:40 +0100 (CET) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Aq4HALLwgUdDWxLC/2dsb2JhbACBWKZJ X-IronPort-AV: E=Sophos;i="4.24,254,1196636400"; d="scan'208";a="20983977" Received: from ip67-91-18-194.z18-91-67.customer.algx.net (HELO server1.bertec.net) ([67.91.18.194]) by mail4-smtp-sop.national.inria.fr with ESMTP; 07 Jan 2008 18:31:40 +0100 Received: from kuba.bertec.net (kuba.bertec.net [192.168.2.16]) by server1.bertec.net (Postfix) with ESMTP id 48916105830 for ; Mon, 7 Jan 2008 12:31:39 -0500 (EST) From: Kuba Ober To: caml-list@yquem.inria.fr Subject: Re: [Caml-list] Performance questions, -inline, ... Date: Mon, 7 Jan 2008 12:31:38 -0500 User-Agent: KMail/1.9.6 (enterprise 0.20071123.740460) References: <200801031128.30183.ober.14@osu.edu> <200801071441.48212.jon@ffconsultancy.com> <47825AA0.3020702@mcmaster.ca> In-Reply-To: <47825AA0.3020702@mcmaster.ca> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Content-Disposition: inline Message-Id: <200801071231.38571.ober.14@osu.edu> X-Spam: no; 0.00; -inline:01 high-level:01 metaocaml:01 yup:01 compiler:01 syntax:01 inlining:01 trivial:01 syntax:01 lexer:01 afaik:01 unclean:01 gcc:01 higher-level:01 gcc:01 On Monday 07 January 2008, Jacques Carette wrote: > Jon Harrop wrote: > > You mean it might be possible to recover the performance of C from > > numerical code with high-level abstractions? Yes. Indeed, I would like = to > > see this done. However, I've never heard of an implementation of any > > language that can do this. > > > With MetaOCaml you can -- see either the long version > http://www.cas.mcmaster.ca/~carette/scp_metamonads.pdf > or the more condensed version > http://www.cas.mcmaster.ca/~carette/metamonads/index.html > Yup, that's what my homegrown "compiler" does, although it is all LISP, and= =20 not really nicely structured - it has grown out of an assembler with LISP=20 macros. Eventually the macros became powerful enough to do register=20 allocation and optimizations, it was a short road to getting the "assembler= "=20 to dig generic expression input in LISP syntax too. Function inlining=20 and "hoisting" are mostly trivial syntax tree operations -- once you have t= he=20 predicates written to tell you whether it's safe to optimize in a given way. =46rom day 1 it was a LISP-syntax assembler, of course. Which may look fun= ny,=20 but I never had to write a lexer for it, and "parsing" LISP isn't much :) I= n=20 about 12klocs, for my application it generates assembly mostly on par with= =20 what I could ever write - 95% is DSP operations in fixed point (matrix by=20 vector multiplies, vector adds, MACs (FIR/IIR)). It also handles saturation= s=20 where necessary - I've added a "saturated integer" type which gracefully=20 saturates on overflow. Kinda necessary in control loops. To perform well, some of my applications can't even be coded in C unless yo= u'd=20 use some (AFAIK nonexistent) C-extensions, or resort to relatively unclean= =20 techniques like inlineable assembly functions. That's what gcc digs very=20 well, but it becomes mostly unmaintainable crap for someone who only knows = a=20 higher-level language and assembly, but has no gcc experience. Most languag= es=20 are also perfect at hiding the processor's state/flags register from you, a= nd=20 it's nigh impossible to expect the optimizer to deduce that all you want to= =20 do is long addition or multiplication. I can see (if it gets there) my OCam= l=20 front-end by default having the result type of + being an (int, bool) tuple= =20 that coerces to int by default, with the bool being the carry status. Thank= =20 $deity almost all CPUs know what carry means, so this could be pretty=20 portable if I were after that -- which I'm not. I gave up on working with gcc codebase mostly because C is a very unexpress= ive=20 language for anything scientific-related (I wholeheartedly agree with Jon=20 Harrop here). And I don't like the C++ "bolted-on" functional template=20 metaprogramming either - it reads like a Turing-machine program to me. It i= s=20 pretty nice in theory, but in practice I found it distracting. And the erro= r=20 messages are a yet another story. But enough rambling, I have some thinking to do :) Cheers, Kuba