Mailing list for all users of the OCaml language and system.
From: Xavier Leroy <xavier.leroy@inria.fr>
To: dolfi@zkm.de
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] novice puzzled by speed tests
Date: Sun, 4 Jan 2004 17:49:20 +0100
Message-ID: <20040104174920.A20953@pauillac.inria.fr>
In-Reply-To: <32894.80.131.128.216.1073134644.squirrel@bild.zkm.de>; from dolfi@zkm.de on Sat, Jan 03, 2004 at 01:57:24PM +0100

> Toying around with 3.07, I found that ocamlopt.opt -unsafe (on Mandrake
> 9.1, Pentium 4, 2.4 GHz) actually produces slower code than ocamlopt.opt.
> On my box, the corresponding C program (gcc -O3) is slightly slower than
> the ocamlopt.opt compiled O'Caml program, but about 25-30% faster than the
> -unsafe one:
> Of course it's good that range checking increases the speed of programs,
> but, being a long-time C user, I'm a little bit puzzled by miracles like
> this. I suspected that the sense of the -unsafe flag was inverted, but it
> isn't: the -unsafe program dies with SEGV when I deliberately introduce a
> range overflow, the safe one gets an exception.

Welcome to the wonderful world of modern processors.  It's not
uncommon to observe "absurd" speed differences of up to 20%.  By
"absurd" I mean for instance adding or removing dead code (never
executed) and observing a slowdown, or executing more code and
observing a speedup.

As far as I can guess, this is due to two processor features:

- Lots of instruction-level parallelism is available.  Thus, if your
main code doesn't use all of the computational units, adding extra code
(such as array bound checks) that can execute in parallel doesn't
reduce execution speed.

- Performance is very sensitive to code placement.  Things like code
cache conflicts, (mis-) alignment of branch targets, and oddities in the
instruction decoding logic can cause insertion *or deletion* of a few
instructions to have significant impact on execution speed.

These are just wild guesses.  The bottom line is that processors have
become so complex that explaining observed performance (let alone
predicting it) has become nearly impossible, at least for
small performance variations (say, less than a factor of 1.5).

(This makes compiler writers very unhappy, because they used to make a
living by cranking out 5% speed improvements, which are now lost in
the overall noise :-)

If you have access to a good performance analysis tool, such as
Intel's VTune, you could run it on your three executables and see if
some rational explanation comes out of VTune's figures.  But I
wouldn't bet on it.
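
For readers without such a tool, a rough way to look at the effect is a
micro-benchmark that runs the same loop with and without bounds checks.
The following is only a minimal sketch, not the program from the original
post: the names sum_checked, sum_unchecked and time_it are hypothetical,
and the array size and repetition count are arbitrary. It uses
Array.unsafe_get from the standard library to skip the check in one loop
only, so both variants live in the same executable.

```ocaml
(* Sum an array using ordinary, bounds-checked access. *)
let sum_checked (a : int array) : int =
  let s = ref 0 in
  for i = 0 to Array.length a - 1 do
    s := !s + a.(i)
  done;
  !s

(* Same loop, but each access explicitly skips the bounds check. *)
let sum_unchecked (a : int array) : int =
  let s = ref 0 in
  for i = 0 to Array.length a - 1 do
    s := !s + Array.unsafe_get a i
  done;
  !s

(* Hypothetical helper: call f on a [reps] times and return the last
   result together with the CPU time consumed, in seconds. *)
let time_it (reps : int) (f : int array -> int) (a : int array) : int * float =
  let t0 = Sys.time () in
  let r = ref 0 in
  for _i = 1 to reps do
    r := f a
  done;
  (!r, Sys.time () -. t0)

let () =
  (* Arbitrary workload: 100_000 small integers, summed 200 times. *)
  let a = Array.init 100_000 (fun i -> i land 0xff) in
  let (r1, t1) = time_it 200 sum_checked a in
  let (r2, t2) = time_it 200 sum_unchecked a in
  assert (r1 = r2);
  Printf.printf "checked:   %.3fs\nunchecked: %.3fs\n" t1 t2
```

Note that compiling this file with -unsafe removes the check from
sum_checked as well, so the comparison is only meaningful without that
flag. Expect the timings to vary between runs and machines, and do not
read much into differences of a few percent.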

- Xavier Leroy
