* Re: [Caml-list] Ocaml sums the harmonic series -- four ways, four benchmarks: floating point performance
From: Philippe Lelédy @ 2005-01-16 9:57 UTC (permalink / raw)
To: Caml-list
Xavier Leroy wrote:
> done;
> !sum +. 0.0;;
>
> The +. 0.0 at the end is ugly but convinces ocamlopt that !sum is best
> kept unboxed during the loop.
Here are my timings, which show little difference with or without this hack:
On 1.8 GHz PowerPC G5 (MacOS X 10.3.7, Objective Caml version 3.08.0)
./sumH4 1000000000 17.65s user 0.16s system 91% cpu 19.461 total
./sumH5 1000000000 16.17s user 0.11s system 91% cpu 17.702 total
On Intel(R) Pentium(R) 4 CPU 3.00GHz (Debian GNU/Linux, Objective Caml version 3.08.2)
./sumH4 1000000000 15,57s user 0,00s system 99% cpu 15,646 total
./sumH5 1000000000 15,45s user 0,00s system 99% cpu 15,480 total
Ph. L.
* Ocaml sums the harmonic series -- four ways, four benchmarks: floating point performance
From: Will M. Farr @ 2005-01-13 15:53 UTC (permalink / raw)
To: shootout-list; +Cc: caml-list
I've been looking at using ocaml to implement a gravitational n-body
code, and therefore have quite a bit of interest in its floating-point
performance. Also, I'm learning the language by playing around with
simple programs. Here's an implementation (really 4) along with timing
information of the "harmonic" benchmark (essentially summing the
harmonic series), which can be found here:
http://shootout.alioth.debian.org/sandbox/benchmark.php?test=harmonic&lang=all&sort=cpu
After testing different ways of implementing the OCaml harmonic
benchmark, I have settled on the following program. For 1 000 000 000
terms, it takes about 25% longer than the corresponding algorithm in C.
Note that I have replaced the int->float conversion for each term with a
single floating-point operation: ifloat := !ifloat +. 1.0. Since
int->float conversions are slow on my machine (PowerBook G4), this is a
big win (about a factor of 2 in time for the C program).
Alas, when the number of terms approaches 16 digits, this method will
lose accuracy: a double has a 53-bit mantissa, so adding 1.0 to a
~16-digit number gets rounded to the spacing of the last mantissa bit
rather than advancing by exactly 1. However, for sizes like the ones the
shootout seems to be using, this algorithm works fine (and the usual
int type won't hold anything close to 16 digits anyway!). I'm cc-ing
this to the caml list because there may be people there interested in
the floating-point performance of Caml.
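The precision limit described above can be checked directly. The following is a minimal sketch, assuming IEEE 754 double-precision arithmetic (as used by OCaml's float on platforms that evaluate in double precision); the names two_53 and increment_is_lost are illustrative only and are not part of the original program:

```ocaml
(* 2^53 is the first magnitude at which doubles can no longer represent
   every integer, so a counter incremented by 1.0 stalls there. *)
let two_53 = 9007199254740992.0

(* True when adding 1.0 to [x] leaves it unchanged.
   Illustrative helper, not from the original post. *)
let increment_is_lost x = x +. 1.0 = x

let () =
  Printf.printf "below 2^53: lost = %b\n" (increment_is_lost (two_53 -. 2.0));
  Printf.printf "at 2^53:    lost = %b\n" (increment_is_lost two_53);
  (* The native int range depends on the word size, which is why a plain
     int loop counter cannot reach 16 digits on a 32-bit system. *)
  Printf.printf "max_int = %d\n" max_int
```

For the benchmark size of 10^9 terms the counter stays far below 2^53, so the float increment is exact there.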
Here's the code for the fastest implementation:
let sum_harmonic4 n =
  let sum = ref 1.0 in
  let ifloat = ref 2.0 in
  for i = 2 to n do
    sum := !sum +. 1.0/.(!ifloat);
    ifloat := !ifloat +. 1.0
  done;
  !sum;;

let _ =
  let n = int_of_string (Sys.argv.(1)) in
  Printf.printf "%g\n" (sum_harmonic4 n);;
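As a sanity check on the loop above, the result can be compared with the standard asymptotic estimate H_n ~ ln n + gamma + 1/(2n). This is only a sketch: euler_gamma and harmonic_estimate are names introduced here for illustration, and the loop is repeated so that the block is self-contained. For n = 1 000 000 000 the estimate is about 21.3005, consistent with the output reported later in this thread.

```ocaml
(* Same loop as sum_harmonic4, repeated so the sketch is self-contained. *)
let sum_harmonic4 n =
  let sum = ref 1.0 in
  let ifloat = ref 2.0 in
  for _i = 2 to n do
    sum := !sum +. 1.0 /. !ifloat;
    ifloat := !ifloat +. 1.0
  done;
  !sum

(* Euler-Mascheroni constant, rounded to double precision. *)
let euler_gamma = 0.5772156649015329

(* Asymptotic estimate H_n ~ ln n + gamma + 1/(2n); the neglected terms
   are of order 1/n^2. Illustrative helper, not from the original post. *)
let harmonic_estimate n =
  let x = float_of_int n in
  log x +. euler_gamma +. 1.0 /. (2.0 *. x)

let () =
  let n = 1_000_000 in
  Printf.printf "loop: %.10f  estimate: %.10f\n"
    (sum_harmonic4 n) (harmonic_estimate n)
```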
And here are all the implementations I tried (for those interested in
such things with OCaml):
let sum_harmonic n =
  let rec loop i sum =
    if i <= n then
      loop (i + 1) (sum +. 1.0/.(float_of_int i))
    else
      sum in
  loop 2 1.0;;

let sum_harmonic2 n =
  let sum = ref 1.0 in
  for i = 2 to n do
    sum := !sum +. 1.0/.(float_of_int i)
  done;
  !sum;;

let sum_harmonic3 n =
  let rec loop i ifloat sum =
    if i <= n then
      loop (i + 1) (ifloat +. 1.0) (sum +. 1.0/.ifloat)
    else
      sum in
  loop 2 2.0 1.0;;

let sum_harmonic4 n =
  let sum = ref 1.0 in
  let ifloat = ref 2.0 in
  for i = 2 to n do
    sum := !sum +. 1.0/.(!ifloat);
    ifloat := !ifloat +. 1.0
  done;
  !sum;;

let _ =
  let n = int_of_string (Sys.argv.(1)) in
  Printf.printf "%g\n" (sum_harmonic4 n);;
The timings for my machine (PowerBook G4, 800 MHz) are as follows:
time ./harmonic 1000000000:
harmonic:   user 2m1.590s    sys 0m0.790s
harmonic2:  user 2m0.340s    sys 0m0.440s
harmonic3:  user 1m44.350s   sys 0m0.740s
harmonic4:  user 1m12.680s   sys 0m0.430s
Each invocation was compiled with "ocamlopt -unsafe -noassert -o
harmonic harmonic.ml". It looks like using references and loops is *by
far* the fastest (and also that my PowerBook is pretty slow to convert
int->float, but I don't think this is related to ocaml, since the C
version does the same thing).
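The variants can also be compared from inside a single program rather than by recompiling and running time on each one. The following is a rough sketch, assuming Sys.time (processor time in seconds, from the standard library) is precise enough for this purpose; time_it is a hypothetical helper that is not part of the original benchmark, and only two of the four variants are repeated to keep the block short:

```ocaml
(* Two of the variants, repeated so the sketch is self-contained. *)
let sum_harmonic2 n =
  let sum = ref 1.0 in
  for i = 2 to n do
    sum := !sum +. 1.0 /. float_of_int i
  done;
  !sum

let sum_harmonic4 n =
  let sum = ref 1.0 in
  let ifloat = ref 2.0 in
  for _i = 2 to n do
    sum := !sum +. 1.0 /. !ifloat;
    ifloat := !ifloat +. 1.0
  done;
  !sum

(* [time_it label f n] runs [f n] once, prints the result together with the
   processor time reported by Sys.time, and returns the result.
   Hypothetical helper, not from the original post. *)
let time_it label f n =
  let start = Sys.time () in
  let result = f n in
  let elapsed = Sys.time () -. start in
  Printf.printf "%-14s %g  (%.3fs)\n" label result elapsed;
  result

let () =
  let n = 1_000_000 in
  ignore (time_it "sum_harmonic2" sum_harmonic2 n);
  ignore (time_it "sum_harmonic4" sum_harmonic4 n)
```

A single run per variant is noisy; repeating each measurement a few times and using a larger n would give more trustworthy numbers.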
Hope you all find this interesting.
Will
* Re: [Caml-list] Ocaml sums the harmonic series -- four ways, four benchmarks: floating point performance
From: John Prevost @ 2005-01-13 17:29 UTC (permalink / raw)
To: Will M. Farr; +Cc: shootout-list, caml-list
On Thu, 13 Jan 2005 10:53:16 -0500, Will M. Farr <farr@mit.edu> wrote:
> Each invocation was compiled with "ocamlopt -unsafe -noassert
> -o harmonic harmonic.ml". It looks like using references and
> loops is *by far* the fastest (and also that my PowerBook is
> pretty slow to convert int->float, but I don't think this is
> related to ocaml, since the C version does the same thing).
Note that this is dependent on what CPU you're using. On my test
system (700MHz AMD Athlon with 256MB of memory), I saw this behavior:
time ./harmonic 1000000000:
harmonic:
you: 2m01.590s .. 0m00.790s
me: 0m30.811s .. 0m00.120s
harmonic2:
you: 2m00.340s .. 0m00.440s
me: 0m30.847s .. 0m00.140s
harmonic3:
you: 1m44.350s .. 0m00.740s
me: 0m38.002s .. 0m00.130s
harmonic4:
you: 1m12.680s .. 0m00.430s
me: 1m14.603s .. 0m00.220s
So on this system, harmonic4 is by far the slowest, and the fastest
version is the one that uses float_of_int and tail recursion. It's
unclear to me how much of this is because ocamlopt's x86 back-end is
simply better optimized than its PPC back-end.
John.
* Re: [Caml-list] Ocaml sums the harmonic series -- four ways, four benchmarks: floating point performance
From: Will M. Farr @ 2005-01-13 19:01 UTC (permalink / raw)
To: John Prevost; +Cc: caml-list, shootout-list
Is the PowerPC ocamlopt back-end less optimized than the x86? I didn't
realize that ocamlopt did enough optimizations that the backend would
be substantially different on the different architectures (in the
manual they say that it compiles the code essentially as written -- no
loop unrolling, etc). Are you sure that there isn't just a built-in
instruction on the x86 that adds an int to a float?
Will
* Re: [Caml-list] Ocaml sums the harmonic series -- four ways, four benchmarks: floating point performance
From: John Prevost @ 2005-01-13 20:24 UTC (permalink / raw)
To: Will M. Farr; +Cc: caml-list, shootout-list
There quite possibly is--I could look. But I do believe that the
Intel architecture is best optimized for at least some set of
operations. For example, looking through the assembly source, you'll
notice that it sometimes abuses Intel addressing modes to reduce the
cost of "Caml ints are just like native ints with a 1 in the low bit".
As for whether there's a quick "convert int to float" call in Intel, I
really have no idea. The assembly for the simple function:
let test x = float_of_int x
isn't trivial, however. I have to admit that I don't know the ins and
outs of Intel assembly any further than I have learned them while
trying to optimize specific O'Caml loops. And since I rarely use
floating point, all of these opcodes are Greek to me. :) I *think*
it's allocating space in the heap for the float, then filling it in
with a non-normalized value (which is pretty easy, since doubles are
64 bits, and ints are 31 bits), and then saying "normalize this,
please." But I can't say for sure. And since I don't have a PPC
system to play with, I can't compare.
John.
* Re: [Caml-list] Ocaml sums the harmonic series -- four ways, four benchmarks: floating point performance
From: Erik de Castro Lopo @ 2005-01-13 20:50 UTC (permalink / raw)
To: caml-list; +Cc: shootout-list
On Thu, 13 Jan 2005 15:24:19 -0500
John Prevost <j.prevost@gmail.com> wrote:
> As for whether there's a quick "convert int to float" call in Intel, I
> really have no idea. The assembly for the simple function:
>
> let test x = float_of_int x
>
> isn't trivial, however.
Int to float should just work. Int to float is another matter. See
this:
http://www.mega-nerd.com/FPcast/
Erik
--
+-----------------------------------------------------------+
Erik de Castro Lopo nospam@mega-nerd.com (Yes it's valid)
+-----------------------------------------------------------+
"Whenever the C++ language designers had two competing ideas as to
how they should solve some problem, they said, "OK, we'll do them
both". So the language is too baroque for my taste." -- Donald E Knuth
* Re: [Caml-list] Ocaml sums the harmonic series -- four ways, four benchmarks: floating point performance
From: Erik de Castro Lopo @ 2005-01-13 21:32 UTC (permalink / raw)
To: caml-list
On Fri, 14 Jan 2005 07:50:57 +1100
Erik de Castro Lopo <ocaml-erikd@mega-nerd.com> wrote:
> Int to float should just work.
Yes.
> Int to float is another matter. See
I meant float to int of course.
> this:
>
> http://www.mega-nerd.com/FPcast/
Erik
--
+-----------------------------------------------------------+
Erik de Castro Lopo nospam@mega-nerd.com (Yes it's valid)
+-----------------------------------------------------------+
"I consider C++ the most significant technical hazard to the survival
of your project and do so without apologies." -- Alistair Cockburn
* Re: [Caml-list] Ocaml sums the harmonic series -- four ways, four benchmarks: floating point performance
From: Xavier Leroy @ 2005-01-15 11:55 UTC (permalink / raw)
To: Will M. Farr; +Cc: shootout-list, caml-list
> Here's an implementation (really 4) along with timing
> information of the "harmonic" benchmark (essentially summing the
> harmonic series) [...]
> Here's the code for the fastest implementation:
The following slight modification of your code generates asm code that
is closest to what a C compiler would produce:
let sum_harmonic5 n =
  let sum = ref 1.0 in
  let ifloat = ref 2.0 in
  for i = 2 to n do
    sum := !sum +. 1.0/.(!ifloat);
    ifloat := !ifloat +. 1.0
  done;
  !sum +. 0.0;;
The +. 0.0 at the end is ugly but convinces ocamlopt that !sum is best
kept unboxed during the loop.
> (note that I have replaced an int->float conversion for
> each term with a single floating point operation: ifloat := ifloat +.
> 1.0). Since int->float conversions are slow on my machine (PowerBook
> G4)
Right, the PowerPC does not have an int -> float instruction and that
conversion must be performed with a rather expensive sequence of
instructions (for the gory details, see e.g.
http://the.wall.riscom.net/books/proc/ppc/cwg/code3.html#303610).
64-bit PPCs have a dedicated instruction to do this conversion,
showing that the IBM/Motorola people learn from their past mistakes...
For Intel processors, it's the reverse conversion (float -> int) that
is slow. Clearly, the SPEC benchmark doesn't contain many conversions
between floats and ints, otherwise hardware designers would pay more
attention :-)
> this is a big win (about a factor of 2 in time for the C program).
As others have mentioned, this strongly depends on the processor
instruction set and even on the processor model. My own benchmarks
(with your Caml code) give the following results:
PPC G4 (Cube) 1 < 2 < 3 < 4 < 5 speed ratio = 1.5
Xeon 2.8 3 < 4 < 1 = 2 < 5 speed ratio = 1.02
Pentium 4 2.0 3 < 1 < 2 < 4 < 5 speed ratio = 1.2
Athlon XP 1.4 4 < 5 < 3 < 1 < 2 speed ratio = 2.2
where 1, 2, 3, 4, 5 refer to the 5 different functions,
1 < 2 means "1 is slower than 2",
and "speed ratio" is the speed difference between fastest and slowest.
The Xeon case is what I was expecting: the running time is dominated by
the time it takes to do the float divisions, everything else is done in
parallel or takes negligible time, so it doesn't matter much how you
write the code.
The Athlon figures are *very* surprising. It could be the case that
this benchmark falls into a quirk of that (otherwise excellent :-)
processor.
Actually, this often happens with micro-benchmarks: they are so small
and their mix of operations is so unbalanced that they can easily run
into weird processor behaviors. So, don't draw conclusions hastily.
Will M. Farr asks:
> Is the PowerPC ocamlopt back-end less optimized than the x86?
No, not really. The x86 back-end works harder to work around oddities
in the x86 instruction set (e.g. the lack of floating-point
registers), but that is hardly an optimization, just compensating for
brain damage in the instruction set. Conversely, the PPC back-end
performs basic-block instruction scheduling while the x86 back-end doesn't.
Instruction scheduling helped with early PPC chips (601, 603) but is
largely irrelevant with modern out-of-order PPC implementations.
> Are you sure that there isn't just a built-in
> instruction on the x86 that adds an int to a float?
I think there exists one such instruction, but ocamlopt doesn't use
it, and the Intel optimization manuals recommend doing an int->float
conversion followed by a float addition instead.
- Xavier Leroy
* Re: [Caml-list] Ocaml sums the harmonic series -- four ways, four benchmarks: floating point performance
From: Michal Moskal @ 2005-01-15 15:49 UTC (permalink / raw)
To: Xavier Leroy; +Cc: Will M. Farr, shootout-list, caml-list
On Sat, 15 Jan 2005 12:55:19 +0100, Xavier Leroy <Xavier.Leroy@inria.fr> wrote:
> As others have mentioned, this strongly depends on the processor
> instruction set and even on the processor model. My own benchmarks
> (with your Caml code) give the following results:
>
> PPC G4 (Cube) 1 < 2 < 3 < 4 < 5 speed ratio = 1.5
> Xeon 2.8 3 < 4 < 1 = 2 < 5 speed ratio = 1.02
> Pentium 4 2.0 3 < 1 < 2 < 4 < 5 speed ratio = 1.2
> Athlon XP 1.4 4 < 5 < 3 < 1 < 2 speed ratio = 2.2
I tested it on Athlon 64 3000+ using both 32bit and 64bit compilers,
the results:
32bit: 4 = 5 < 3 < 1 = 2, speed ratio 2.2
64bit: 3 < 1 = 2 = 4 < 5, speed ratio 1.15
Difference between 64 and 32 bit version (best cases) is 1.30 (64 is faster).
All tests were performed using ocaml 3.07.
> The Athlon figures are *very* surprising. It could be the case that
> this benchmark falls into a quirk of that (otherwise excellent :-)
> processor.
So I guess in 32 bit mode it remains the same on newer athlons.
--
: Michal Moskal :: http://nemerle.org/~malekith/ :: GCS !tv h e>+++ b++
: No, I will *not* fix your computer............ :: UL++++$ C++ E--- a?
* Re: [Caml-list] Ocaml sums the harmonic series -- four ways, four benchmarks: floating point performance
From: Yaron Minsky @ 2005-01-15 17:13 UTC (permalink / raw)
To: Xavier Leroy; +Cc: Will M. Farr, shootout-list, caml-list
On Sat, 15 Jan 2005 12:55:19 +0100, Xavier Leroy <Xavier.Leroy@inria.fr> wrote:
> The following slight modification of your code generates asm code that
> is closest to what a C compiler would produce:
>
> let sum_harmonic5 n =
>   let sum = ref 1.0 in
>   let ifloat = ref 2.0 in
>   for i = 2 to n do
>     sum := !sum +. 1.0/.(!ifloat);
>     ifloat := !ifloat +. 1.0
>   done;
>   !sum +. 0.0;;
>
> The +. 0.0 at the end is ugly but convinces ocamlopt that !sum is best
> kept unboxed during the loop.
That last comment is very interesting and surprising to me. I've
looked over the optimization suggestions for the compiler, and I don't
understand why that last +. convinces the compiler to unbox sum. Can
you explain why that is? Floating point performance is important to
me, and I'd like to get a better grasp on it.
(As a general matter, it would be nice to have some tools to
understand things like unboxing and inlining a little better. For
example, it would be great to have something akin to -dtypes that
outputs information with which one could check whether a certain
function call is inlined, or whether a certain float is unboxed.)
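Short of a dedicated tool, one rough way to see whether a float is being boxed inside a loop is to measure minor-heap allocation around the call. The sketch below assumes a compiler recent enough to provide Gc.minor_words and assumes native compilation (bytecode boxes floats regardless); words_allocated is a hypothetical helper introduced here, and the two functions are repeated so the block is self-contained. A word count that grows with n suggests a float allocated per iteration, while a small constant suggests the accumulator stayed unboxed. The outcome depends on the compiler version, so no particular numbers are claimed here.

```ocaml
let sum_harmonic4 n =
  let sum = ref 1.0 in
  let ifloat = ref 2.0 in
  for _i = 2 to n do
    sum := !sum +. 1.0 /. !ifloat;
    ifloat := !ifloat +. 1.0
  done;
  !sum

let sum_harmonic5 n =
  let sum = ref 1.0 in
  let ifloat = ref 2.0 in
  for _i = 2 to n do
    sum := !sum +. 1.0 /. !ifloat;
    ifloat := !ifloat +. 1.0
  done;
  !sum +. 0.0

(* Returns the result of [f n] together with the number of minor-heap
   words allocated while it ran. Hypothetical helper, not a compiler tool. *)
let words_allocated f n =
  let before = Gc.minor_words () in
  let result = f n in
  let after = Gc.minor_words () in
  (result, after -. before)

let () =
  let n = 1_000_000 in
  let (r4, w4) = words_allocated sum_harmonic4 n in
  let (r5, w5) = words_allocated sum_harmonic5 n in
  Printf.printf "sum_harmonic4: %g, %.0f words\n" r4 w4;
  Printf.printf "sum_harmonic5: %g, %.0f words\n" r5 w5
```

Reading the assembly that ocamlopt emits with its -S option remains the more direct check; the allocation count is only a quick proxy.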
y
* Re: [Caml-list] Ocaml sums the harmonic series -- four ways, four benchmarks: floating point performance
From: Oliver Bandel @ 2005-01-23 2:27 UTC (permalink / raw)
To: caml-list
On Thu, Jan 13, 2005 at 10:53:16AM -0500, Will M. Farr wrote:
[...]
> Here's the code for the fastest implementation:
>
> let sum_harmonic4 n =
>   let sum = ref 1.0 in
>   let ifloat = ref 2.0 in
>   for i = 2 to n do
>     sum := !sum +. 1.0/.(!ifloat);
>     ifloat := !ifloat +. 1.0
>   done;
>   !sum;;
>
> let _ =
>   let n = int_of_string (Sys.argv.(1)) in
>   Printf.printf "%g\n" (sum_harmonic4 n);;
I tried harmonic4 on a PowerBook G4, 400 MHz, and the
native code needs about 1 min 50 s.
The bytecode for harmonic4 runs in about 1 min 53 s.
It seems that there is no real difference between
bytecode and native code, at least on that system,
or at least on that task.
I use Mac OS X Panther. It seems to be more than twice as fast as your OS
(look at the processor frequency: 400 MHz on my PB G4, 800 MHz on yours...).
Which OS are you running? An older version of Mac OS X? Or Linux (which one)?
Maybe you can speed up your calculations a lot by installing a different
operating system on your computer.
I didn't try the other implementations.
IMHO you can gain more performance by
changing your OS, and more easily than by looking at code optimizations...?!
(which you can nevertheless do too)
Ciao,
Oliver
* Re: [Caml-list] Ocaml sums the harmonic series -- four ways, four benchmarks: floating point performance
From: Will M. Farr @ 2005-01-23 6:07 UTC (permalink / raw)
To: Oliver Bandel; +Cc: caml-list
I'm running 10.3.7 -- I don't think there's any newer version. When I
run harmonic4 as follows:
time ./harmonic 1000000000
21.3005
real 1m3.764s
user 1m0.590s
sys 0m0.130s
the above is what I get. I'm not sure why I'm not exactly 2x faster
than you, but there are plenty of things that could affect that.
Running the bytecode on my system gives:
time ./harmonic.bc 1000000000
21.3005
real 11m51.239s
user 11m11.600s
sys 0m0.940s
I would be pretty surprised to see the bytecode come even close to the
native code version --- are you sure about the numbers on your system?
Will
* Re: [Caml-list] Ocaml sums the harmonic series -- four ways, four benchmarks: floating point performance
From: Oliver Bandel @ 2005-01-23 15:18 UTC (permalink / raw)
To: Will M. Farr; +Cc: caml-list
On Sun, Jan 23, 2005 at 01:07:30AM -0500, Will M. Farr wrote:
> I'm running 10.3.7 -- I don't think there's any newer version. When I
> run harmonic4 as follows:
>
> time ./harmonic 1000000000
> 21.3005
>
> real 1m3.764s
> user 1m0.590s
> sys 0m0.130s
>
> the above is what I get. I'm not sure why I'm not exactly 2x faster
> than you, but there's plenty of things which could affect that.
>
> Running the bytecode on my system gives:
>
> time ./harmonic.bc 1000000000
> 21.3005
>
> real 11m51.239s
> user 11m11.600s
> sys 0m0.940s
>
> I would be pretty surprised to see the bytecode come even close to the
> native code version --- are you sure about the numbers on your system?
No, not anymore!
I have used the wrong binary! :(
I thought I had the same names for the executables, after recompiling
them for the test, but the native-code had a different name and so I called
the same file twice! :(
Sorry, I'm really chaotic these days! :(
Ciao,
Oliver