Message-ID: <19990311104442.30284@pauillac.inria.fr>
Date: Thu, 11 Mar 1999 10:44:42 +0100
From: Xavier Leroy
To: Alexey Nogin, caml-list@inria.fr
Cc: Jason Hickey, Alexei Kopylov
Subject: Re: Upgrade from OCaml 2.01 to OCaml 2.02 made things _slower_!
References: <19990305114112.34610@pauillac.inria.fr> <36E73321.D37B49F6@CS.Cornell.EDU>
In-Reply-To: <36E73321.D37B49F6@CS.Cornell.EDU>; from Alexey Nogin on Wed, Mar 10, 1999 at 10:06:09PM -0500

> After I upgraded OCaml & Camlp4 from 2.01 to 2.02, the native code
> of our program became much smaller (5124462 instead of 7219378), but it
> also became a little slower (1.5% - 3.5% for various inputs). Do you
> have an idea what could have caused it?

In the past, I've observed speed variations of at least +/- 5% caused
exclusively by minor variations in code placement (such as adding or
deleting instructions that are never executed). Almost any
modification in the code generator affects code placement. If only
for this reason, speed variations of less than 5% are essentially
meaningless: there's no way to attribute them to a particular
optimization rather than to good or bad luck in code placement.
(Makes you very suspicious of those PLDI papers where they report 1%
speedups...)
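As a rough illustration of the threshold argument above, here is a minimal OCaml sketch that checks whether a measured slowdown exceeds the +/- 5% band attributed to code placement. The names relative_change, placement_noise and exceeds_noise are hypothetical, and the two timings are made-up values chosen only to reproduce a 3.5% slowdown; they are not measurements from this thread.

```ocaml
(* Relative change going from [before] to [after]; 0.035 means 3.5% more. *)
let relative_change ~before ~after = (after -. before) /. before

(* Assumed noise floor: the +/- 5% variation attributed to code placement. *)
let placement_noise = 0.05

(* A change can be attributed to a cause only if it exceeds the noise floor. *)
let exceeds_noise ?(noise = placement_noise) change =
  abs_float change > noise

let () =
  (* The reported slowdowns, 1.5% and 3.5%, expressed as fractions. *)
  List.iter
    (fun slowdown ->
      Printf.printf "%.1f%% slowdown: %s\n" (100. *. slowdown)
        (if exceeds_noise slowdown then "above noise" else "within noise"))
    [ 0.015; 0.035 ];
  (* Hypothetical timings in seconds, for illustration only. *)
  let change = relative_change ~before:10.00 ~after:10.35 in
  Printf.printf "hypothetical run: %s\n"
    (if exceeds_noise change then "above noise" else "within noise")
```

The noise floor is an optional argument so that a different band can be tried, for example exceeds_noise ~noise:0.01 change, if repeated runs on a given machine suggest a tighter or looser bound.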
> Also, I was doing some performance measurements (using P6 performance
> counter support patches for Linux by Erik Hendriks -
> http://beowulf.gsfc.nasa.gov/software/ ) when I upgraded, so I have some
> information (and can get more of it) on the performance counters for my
> program under both 2.01 and 2.02. In particular, the number of requests
> from the processor to the L1 data cache became 2%-3% bigger.

That's more meaningful. The two new optimizations in 2.02 (closed
toplevel functions and allocation coalescing) should reduce the number
of memory accesses. Allocation coalescing might increase register
pressure locally, though, causing other values to be spilled to the
stack. Is there any way you could get a per-function profile of
memory requests (as one can on the Alpha with the Digital Unix tools)?

- Xavier Leroy