* Re: HLVM ray tracer performance
@ 2010-01-10 18:29 shawjef3
2010-01-10 20:14 ` [Caml-list] " Jon Harrop
0 siblings, 1 reply; 7+ messages in thread
From: shawjef3 @ 2010-01-10 18:29 UTC
To: caml-list
Jon,
I wanted to run the raytracing benchmark myself to see if Haskell really was that slow. I'm using ghc 6.10 because that's what ubuntu comes with. I don't know if ghc 6.12 generates slower executables than 6.10 or what else might be going on. I ran each several times and the numbers I pasted are typical (+/- 0.2 seconds, say).
jeff@ubuntu:~/Desktop$ ghc --version
The Glorious Glasgow Haskell Compilation System, version 6.10.4
jeff@ubuntu:~/Desktop$ g++ --version
g++ (Ubuntu 4.4.1-4ubuntu8) 4.4.1
Copyright (C) 2009 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
jeff@ubuntu:~/Desktop$ ocamlopt -v
The Objective Caml native-code compiler, version 3.11.1
Standard library directory: /usr/lib/ocaml
I compiled the raytracers for c++, haskell and ocaml from
http://www.ffconsultancy.com/languages/ray_tracer/code/5
and used the compile instructions at
http://www.ffconsultancy.com/languages/ray_tracer/benchmark.html
though I had to change the haskell one to use just ghc instead of specifying a version. I also ran the ocaml and haskell code in the 1/ directory, and they completed within 0.1 seconds of each other.
c++
jeff@ubuntu:~/Desktop$ time ./ray 9 512 > /dev/null
real 0m3.515s
user 0m3.440s
sys 0m0.016s
haskell
jeff@ubuntu:~/Desktop$ time ./ray 9 512 > /dev/null
real 0m5.811s
user 0m5.752s
sys 0m0.032s
ocaml
jeff@ubuntu:~/Desktop$ time ./ray 9 512 > /dev/null
real 0m6.572s
user 0m6.544s
sys 0m0.016s
Jeff
* Re: [Caml-list] Re: HLVM ray tracer performance
2010-01-10 18:29 HLVM ray tracer performance shawjef3
@ 2010-01-10 20:14 ` Jon Harrop
2010-01-10 20:37 ` Richard Jones
2010-01-11 0:47 ` Jeff Shaw
0 siblings, 2 replies; 7+ messages in thread
From: Jon Harrop @ 2010-01-10 20:14 UTC
To: caml-list; +Cc: shawjef3
On Sunday 10 January 2010 18:29:42 shawjef3@msu.edu wrote:
> Jon,
>
> I wanted to run the raytracing benchmark myself to see if Haskell really
> was that slow. I'm using ghc 6.10 because that's what ubuntu comes with.
> I don't know if ghc 6.12 generates slower executables than 6.10 or what
> else might be going on.
I used GHC 6.12 with --make -O2 to get the results from the recent article
because it generated results faster than GHC 6.10. However, I failed to
detect that only the Haskell was generating garbage output. Rerunning the
benchmark with GHC 6.10 here, Haskell does give the correct answer but the
times are even worse than those I quoted.
> I ran each several times and the numbers I pasted
> are typical (+/- 0.2 seconds, say).
>
> jeff@ubuntu:~/Desktop$ ghc --version
> The Glorious Glasgow Haskell Compilation System, version 6.10.4
> jeff@ubuntu:~/Desktop$ g++ --version
> g++ (Ubuntu 4.4.1-4ubuntu8) 4.4.1
> Copyright (C) 2009 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions. There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> jeff@ubuntu:~/Desktop$ ocamlopt -v
> The Objective Caml native-code compiler, version 3.11.1
> Standard library directory: /usr/lib/ocaml
I used g++ 4.3.3 and OCaml 3.11.1 on a 64-bit Linux kernel running 32-bit
userland. The machine is an 8-core with two Quad-Core AMD Opteron(tm) 2352
Processors running at 2.1 GHz. AFAICT they have 512 KB L2 caches each and 2 MB
L3 caches per quad-core CPU.
> I compiled the raytracers for c++, haskell and ocaml from
>
> http://www.ffconsultancy.com/languages/ray_tracer/code/5
>
> and used the compile instructions at
>
> http://www.ffconsultancy.com/languages/ray_tracer/benchmark.html
>
> though I had to change the haskell one to use just ghc instead of
> specifying a version. I also ran the ocaml and haskell code in the 1/
> directory, and they completed within 0.1 seconds of each other.
>
> c++
> jeff@ubuntu:~/Desktop$ time ./ray 9 512 > /dev/null
>
> real 0m3.515s
> user 0m3.440s
> sys 0m0.016s
>
> haskell
> jeff@ubuntu:~/Desktop$ time ./ray 9 512 > /dev/null
>
> real 0m5.811s
> user 0m5.752s
> sys 0m0.032s
>
> ocaml
> jeff@ubuntu:~/Desktop$ time ./ray 9 512 > /dev/null
>
> real 0m6.572s
> user 0m6.544s
> sys 0m0.016s
Are you running x64 or on Intel hardware? What results do you get for 12, 13
or 14 instead of 9?
--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e
* Re: [Caml-list] Re: HLVM ray tracer performance
2010-01-10 20:14 ` [Caml-list] " Jon Harrop
@ 2010-01-10 20:37 ` Richard Jones
2010-01-11 11:03 ` Jon Harrop
2010-01-11 0:47 ` Jeff Shaw
1 sibling, 1 reply; 7+ messages in thread
From: Richard Jones @ 2010-01-10 20:37 UTC
To: Jon Harrop; +Cc: caml-list, shawjef3
On Sun, Jan 10, 2010 at 08:14:29PM +0000, Jon Harrop wrote:
> on a 64-bit Linux kernel running 32-bit userland
I'm assuming you mean x86 (not eg ppc64), in which case that's a very
unusual choice. Any reason for this?
Rich.
--
Richard Jones
Red Hat
* Re: [Caml-list] Re: HLVM ray tracer performance
2010-01-10 20:14 ` [Caml-list] " Jon Harrop
2010-01-10 20:37 ` Richard Jones
@ 2010-01-11 0:47 ` Jeff Shaw
2010-01-11 10:48 ` Jon Harrop
1 sibling, 1 reply; 7+ messages in thread
From: Jeff Shaw @ 2010-01-11 0:47 UTC
To: Jon Harrop; +Cc: caml-list
> Are you running x64 or on Intel hardware? What results do you get for 12, 13
> or 14 instead of 9?
>
>
I am running an AMD Phenom 9950, but the Ubuntu I'm using is just
32-bit. I tried 5/ray.hs with level=12 instead of 9 but it ran into a
stack overflow problem. When I increased the stack size it completed but
it also took more time than 1/ray.hs, which required no stack size
increase. I made sure that the other arguments I fed it were the same. I
think there is some problem that needs to be worked out in the 5/ray.hs.
Maybe the problem is in ghc, I'm not sure. Below, ./ray5 is 5/ray.hs,
and ./ray is 1/ray.hs
jeff@ubuntu:~/Desktop$ time ./ray 12 512 > /dev/null
real 0m21.479s
user 0m21.093s
sys 0m0.180s
jeff@ubuntu:~/Desktop$ time ./ray5 12 512 +RTS -K2000000000 > /dev/null
real 0m28.366s
user 0m25.674s
sys 0m2.608s
jeff@ubuntu:~/Desktop$ time ./ray 14 512 > /dev/null
real 0m23.544s
user 0m23.021s
sys 0m0.500s
I tried level=14 but I ran out of memory for 5/ray.ml and 5/ray.hs.
I considered that maybe I had saved the files from your website wrong,
or mixed them up during compilation. So I ran the timer again with
level=9 and level=12 and got all the same results. That is, level=9 is
faster on 5/ray.hs but level=12 is faster with 1/ray.hs. So I don't
think I'm making a simple manual labor error.
It seems that 5/ray.ml and 5/ray.hs aren't quite equivalent in some
important way since 1/ray.ml is faster than 5/ray.ml for both level=9
and level=12. Whether it's a code problem or compiler problem, I cannot say.
The stack size problem does not go away when I remove all the extra
optimization arguments to ghc.
--Jeff
* Re: [Caml-list] Re: HLVM ray tracer performance
2010-01-11 0:47 ` Jeff Shaw
@ 2010-01-11 10:48 ` Jon Harrop
0 siblings, 0 replies; 7+ messages in thread
From: Jon Harrop @ 2010-01-11 10:48 UTC
To: caml-list; +Cc: Jeff Shaw
On Monday 11 January 2010 00:47:26 Jeff Shaw wrote:
> > Are you running x64 or on Intel hardware? What results do you get for 12,
> > 13 or 14 instead of 9?
>
> I am running an AMD Phenom 9950, but the Ubuntu I'm using is just
> 32-bit.
Then I'm even more surprised that you would see significantly different
results to mine, given that we're running the same architecture.
> I tried 5/ray.hs with level=12 instead of 9 but it ran into a
> stack overflow problem.
Yes. Many of the Haskell versions regularly die with stack overflows. They are
not predictable.
> When I increased the stack size it completed but
> it also took more time than 1/ray.hs, which required no stack size
> increase.
This is an interesting result. I hadn't noticed that the most optimized
Haskell implementation is not necessarily the fastest. However, I think I can
explain the phenomenon: with a huge number of spheres, some groups of spheres
(branches of scene tree) are always occluded and never need to be explicitly
generated but only the Haskell is generating the scene tree lazily. In fact,
it may be the case that with level->infinity only the Haskell requires
bounded space.
For example, at level=13 the 1/ray.hs Haskell takes 25.8s, 2/ray.hs takes 93s
and the 5/ray.ml OCaml takes 118s. Presumably Lennart made the more optimized
Haskell implementations eager in order to improve performance at level=9 but,
in doing so, he degraded performance for level>9.
Unpredictable...
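To make the laziness argument concrete, here is a minimal C++ sketch. It is
not taken from any of the benchmark programs. It assumes a hypothetical scene
tree with a fan-out of 4 per level, and the names LazyNode,
nodes_built_lazily and nodes_built_eagerly are illustrative only. Children
are built on first access and then cached, which roughly imitates what a
Haskell thunk does, and the traversal deliberately follows a single branch to
stand in for a ray that never needs the occluded groups.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical scene node (an assumption, not the benchmark's data type).
// Its children are created only when first requested, then cached.
struct LazyNode {
    int level;                 // remaining depth; a level-1 node is a leaf
    std::size_t& built;        // shared counter of nodes actually constructed
    bool forced = false;       // true once children() has run
    std::vector<std::unique_ptr<LazyNode>> kids;

    LazyNode(int lvl, std::size_t& counter) : level(lvl), built(counter) {
        assert(lvl >= 1);
        ++built;
    }

    // "Force" the children: build them on the first call only.
    std::vector<std::unique_ptr<LazyNode>>& children() {
        if (!forced) {
            forced = true;
            if (level > 1) {
                for (int i = 0; i < 4; ++i)  // fan-out of 4 is an assumption
                    kids.push_back(std::make_unique<LazyNode>(level - 1, built));
            }
        }
        return kids;
    }
};

// Nodes constructed when the traversal only ever descends into the first
// child, so every sibling subtree stays unbuilt.
std::size_t nodes_built_lazily(int level) {
    std::size_t built = 0;
    LazyNode root(level, built);
    LazyNode* n = &root;
    while (!n->children().empty())
        n = n->children().front().get();
    return built;
}

// Nodes an eager builder would construct for the same tree:
// 1 + 4 + 16 + ... over `level` levels.
std::size_t nodes_built_eagerly(int level) {
    std::size_t total = 0, width = 1;
    for (int i = 0; i < level; ++i) {
        total += width;
        width *= 4;
    }
    return total;
}
```

Under these assumptions nodes_built_lazily(9) constructs 33 nodes, while
nodes_built_eagerly(9) counts 87381. A real image traces many rays and so
touches far more branches than this single path, but the gap shows why a lazy
tree can stay small as level grows while an eager one cannot.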
> I made sure that the other arguments I fed it were the same. I
> think there is some problem that needs to be worked out in the 5/ray.hs.
There is no easy solution to this because the performance is a non-trivial
function of "level" and "n".
> I tried level=14 but I ran out of memory for 5/ray.ml and 5/ray.hs.
But 1/ray.hs can handle level=14 and 15:
$ time ./ray 14 512 >image.pgm
real 0m27.581s
user 0m26.790s
sys 0m0.764s
$ time ./ray 15 512 >image.pgm
real 0m29.532s
user 0m28.982s
sys 0m0.552s
In fact, that is faster than any other version.
> It seems that 5/ray.ml and 5/ray.hs aren't quite equivalent in some
> important way since 1/ray.ml is faster than 5/ray.ml for both level=9
> and level=12.
Did you mean .hs instead of .ml here?
> Whether it's a code problem or compiler problem, I cannot
> say.
The relative performance of the Haskell implementations also varies with
compiler versions, of course. I cannot tell when it will run out of memory or
even out of stack space. You just have to try it and, when Haskell dies with
a stack overflow after several minutes, you just have to tweak the
command-line parameters to try again until it happens to work.
Finally, I'd add that this "benefit" of the Haskell will almost certainly
destroy its scalability in the parallel case because you'll have threads
competing to force the evaluation of thunks in the shared scene tree which
incurs global synchronization in wholly unpredictable ways (it even depends
upon the layout of the scene!). So, while this is academically interesting,
I'd argue that it is practically useless.
--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e
* Re: [Caml-list] Re: HLVM ray tracer performance
2010-01-10 20:37 ` Richard Jones
@ 2010-01-11 11:03 ` Jon Harrop
0 siblings, 0 replies; 7+ messages in thread
From: Jon Harrop @ 2010-01-11 11:03 UTC
To: caml-list
Richard asked me to draw a comparison on 64-bit as well because OCaml
sometimes does relatively better there. With level=13, I get:
OCaml 32-bit: 118s
OCaml 64-bit: 95s
HLVM 32-bit: 34.8s
HLVM 64-bit: 30.4s
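As a rough aid to reading these four timings, the ratios can be worked out
with a one-line helper. This is only an illustrative sketch: the function
name speedup is made up here, and it does nothing more than divide the two
wall-clock times quoted above.

```cpp
#include <cassert>

// How many times faster the run taking `fast_s` seconds is than the run
// taking `slow_s` seconds. Both arguments are wall-clock times in seconds.
double speedup(double slow_s, double fast_s) {
    assert(fast_s > 0.0);
    return slow_s / fast_s;
}
```

With the figures above, speedup(118, 95) is about 1.24 for OCaml going from
32-bit to 64-bit and speedup(34.8, 30.4) is about 1.14 for HLVM, while HLVM
is about 3.4 times faster than OCaml on 32-bit (118 / 34.8) and about 3.1
times faster on 64-bit (95 / 30.4).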
--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e
* HLVM ray tracer performance
@ 2010-01-08 14:53 Jon Harrop
0 siblings, 0 replies; 7+ messages in thread
From: Jon Harrop @ 2010-01-08 14:53 UTC
To: caml-list
I just published results for the ray tracer benchmark written in HLVM and
compared to other languages including OCaml:
http://flyingfrogblog.blogspot.com/2010/01/hlvm-on-ray-tracer-language-comparison.html
Note that these results were obtained with HLVM's multicore garbage collector
enabled.
--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e