From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id RAA23898; Sun, 4 Jan 2004 17:49:29 +0100 (MET) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id RAA23404 for ; Sun, 4 Jan 2004 17:49:28 +0100 (MET) Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35]) by concorde.inria.fr (8.11.1/8.11.1) with ESMTP id i04GnLH14381; Sun, 4 Jan 2004 17:49:21 +0100 (MET) Received: (from xleroy@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id RAA23578; Sun, 4 Jan 2004 17:49:20 +0100 (MET) Date: Sun, 4 Jan 2004 17:49:20 +0100 From: Xavier Leroy To: dolfi@zkm.de Cc: caml-list@inria.fr Subject: Re: [Caml-list] novice puzzled by speed tests Message-ID: <20040104174920.A20953@pauillac.inria.fr> References: <32894.80.131.128.216.1073134644.squirrel@bild.zkm.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Mailer: Mutt 1.0i In-Reply-To: <32894.80.131.128.216.1073134644.squirrel@bild.zkm.de>; from dolfi@zkm.de on Sat, Jan 03, 2004 at 01:57:24PM +0100 X-Loop: caml-list@inria.fr X-Spam: no; 0.00; caml-list:01 toying:01 ocamlopt:01 -unsafe:01 mandrake:01 slower:01 ocamlopt:01 gcc:01 slower:01 -unsafe:01 long-time:01 miracles:01 inverted:01 segv:01 speedup:01 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk > Toying around with 3.07, I found that ocamlopt.opt -unsafe (on Mandrake > 9.1, Pentium 4, 2.4 GHz) actually produces slower code than ocamlopt.opt. > On my box, the corresponding C program (gcc -O3) is slightly slower than > the ocamlopt.opt compiled O'Caml program, but about 25-30% faster than the > -unsafe one: > Of course it's good that range checking increases the speed of programs, > but, being a long-time C user, I'm a little bit puzzled by miracles like > this. I suspected that the sense of the -unsafe flag was inverted, but it > isn't: the -unsafe program dies with SEGV when I deliberately introduce a > range overflow, the safe one gets an exception. Welcome to the wonderful world of modern processors. It's not uncommon to observe "absurd" speed differences of up to 20%. By "absurd" I mean for instance adding or removing dead code (never executed) and observing a slowdown, or executing more code and observing a speedup. As far as I can guess, this is due to two processor features: - Lots of instruction-level parallelism is available. Thus, if your main code doesn't use all of the computational units, adding extra code (such as array bound checks) that can execute in parallel doesn't reduce execution speed. - Performance is very sensitive to code placement. Things like code cache conflicts, (mis-) alignment of branch targets, and oddities in the instruction decoding logic can cause insertion *or deletion* of a few instructions to have significant impact on execution speed. These are just wild guesses. The bottom line is that processors have become so complex that explaining observed performances (let alone predicting performances) has become nearly impossible, at least for small performance variations (say, less than a factor of 1.5). (This makes compiler writers very unhappy, because they used to make a living by cranking out 5% speed improvements, which are now lost in the overall noise :-) If you have access to a good performance analysis tool, such as Intel's VTune, you could run it on your three executables and see if some rational explanation comes out of VTune's figures. But I wouldn't bet on it. - Xavier Leroy ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners