From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id IAA04860 for caml-redistribution; Fri, 12 Mar 1999 08:48:10 +0100 (MET) Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id BAA30258 for ; Fri, 12 Mar 1999 01:00:02 +0100 (MET) Received: from simon.cs.cornell.edu (SIMON.CS.CORNELL.EDU [128.84.154.10]) by nez-perce.inria.fr (8.8.7/8.8.7) with ESMTP id AAA07063; Fri, 12 Mar 1999 00:59:57 +0100 (MET) Received: from cloyd.cs.cornell.edu (CLOYD.CS.CORNELL.EDU [128.84.227.15]) by simon.cs.cornell.edu (8.9.3/8.9.3/R-2.1) with ESMTP id SAA17717; Thu, 11 Mar 1999 18:59:57 -0500 (EST) Received: from CS.Cornell.EDU (nogin@GULAG.CS.CORNELL.EDU [128.84.248.53]) by cloyd.cs.cornell.edu (8.8.8/8.8.8/M-1.12) with ESMTP id SAA14262; Thu, 11 Mar 1999 18:59:56 -0500 (EST) Sender: weis Message-ID: <36E858F5.8FAF2CF9@CS.Cornell.EDU> Date: Thu, 11 Mar 1999 18:59:49 -0500 From: Alexey Nogin Organization: Cornell University, department of Computer Science X-Mailer: Mozilla 4.08 [en] (X11; U; Linux 2.0.37 i686) MIME-Version: 1.0 To: Xavier Leroy CC: caml-list@inria.fr, Jason Hickey , Alexei Kopylov , Paul Stodghill Subject: Re: Upgrade from OCaml 2.01 to OCaml 2.02 made things _slower_! References: <19990305114112.34610@pauillac.inria.fr> <36E73321.D37B49F6@CS.Cornell.EDU> <19990311104442.30284@pauillac.inria.fr> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Xavier Leroy wrote: > In the past, I've observed speed variations by at least +/- 5% caused > exclusively by minor variations in code placement (such as adding or > deleting instructions that are never executed). Almost any > modification in the code generator affects code placement. If only > for this reason, speed variations of less than 5% are essentially > meaningless: there's no way to attribute them to a particular > otpimization or to good/bad luck in code placement. (Makes you very > suspicious of those PLDI papers where they report 1% speedups...) > > > Also, I was doing some performance mesurements (using P6 performance > > counter support patches for Linux by Erik Hendriks - > > http://beowulf.gsfc.nasa.gov/software/ ) when I upgraded, so I have some > > information (and can get more of it) on the performance counters for my > > program under both 2.01 and 2.02. In particular, the number of requests > > from the processor to the L1 data cache became 2%-3% bigger. > > That's more meaningful. The two new optimizations in 2.02 (closed > toplevel functions and allocation coalescing) should reduce the number > of memory accesses. Allocation coalescing might increase register > pressure locally, causing other stuff to be spilled on the stack, > though. Well, in this case I should probably try to remove the allocation coalescing and see what happens. Am I right assuming that in order to do that I have to revert changes for versions 1.8 -> 1.9 and 1.10 -> 1.11 of the asmcomp/selectgen.ml? > Is there any way you could get a per-function profile of > memory requests? (like on the Alpha with the Digital Unix tools). I am not sure. I could probably write something gprof-like that would record the values of the performance counters at each function call, but I am afraid that's a lot of work. And I could probably get access to Alpha, but I do not think I will see the same slowdown effect on Alpha as I see on x86, so the Alpha memory access numbers would not help much. Alexey -------------------------------------------------------------- Home Page: http://www.cs.cornell.edu/nogin/ E-Mail: nogin@cs.cornell.edu (office), ayn2@cornell.edu (home) Office: Upson 4139, tel: 1-607-255-4934 ICQ #: 24708107 (office), 24678341 (home)