* [Caml-list] speed, loops vs. hofs
@ 2003-04-28 17:13 Hal Daume III
2003-04-29 14:39 ` [Caml-list] " Andrzej M. Ostruszka
2003-04-30 0:45 ` [Caml-list] " Manos Renieris
0 siblings, 2 replies; 3+ messages in thread
From: Hal Daume III @ 2003-04-28 17:13 UTC (permalink / raw)
To: Caml Mailing List
One of the primary reasons I use OCaml is because it offers speed,
together with nice functional stuff, like HOFs. However, it seems that
using these together might be impossible.
I have a module which allows iteration over a particular datatype. I have
a method of doing this iteration very quickly using for loops. In
particular, I have a function of type:
loopPre : hmm -> node -> (node -> int -> bool -> unit) -> unit
which loops over the predecessors of the given node in the HMM. The
implementation basically looks like:
let loopPre hmm node action =
let ... some definitions ...
begin
action {...make the node...} 0 (...make the bool...);
for i = 0 to num_nodes do
action {...make the node...} i (...make the bool...);
for j = 0 to {something} do
action {...make the node...} j (...make the bool...);
done;
done;
basically, there are a few calls to 'action' in a pair of nested
loops. Then, at the usage point, I do something like:
...1...
loopPre hmm node_j
(fun nd len null ->
...2...
)
...3...
I had a sneaking feeling that if I actually stuck the for loops into the
original code I would get speedups. However, "...2..." is pretty long, so
there were be (a) a bunch of code duplication and (b) it would be hard to
maintain.
In order to test this, I wrote the following program:
% cat loops.ml
let loop low high f =
begin
f 0 true;
for i=low to high do
f i false;
done;
f (-1) true;
end
% cat useLoops.ml
open Loops
let top = 500000000
let cnt = ref 0
let pre1 = Sys.time ()
let _ = loop 1 top (fun i b -> if not b then cnt := !cnt + 1)
let pst1 = Sys.time ()
let cnt = ref 0
let pre2 = Sys.time ()
let _ =
let f i b = if not b then cnt := !cnt + 1 in
loop 1 top f
let pst2 = Sys.time ()
let cnt = ref 0
let pre3 = Sys.time ()
let _ =
begin
let i = 0 in let b = true in
if not b then cnt := !cnt + 1;
for i=1 to top do
let b = false in
if not b then cnt := !cnt + 1;
done;
let i = -1 in let b = true in
if not b then cnt := !cnt + 1;
end
let pst3 = Sys.time ()
let cnt = ref 0
let pre4 = Sys.time ()
let _ =
let f i b = if not b then cnt := !cnt + 1 in
begin
let i = 0 in let b = true in
f i b;
for i=1 to top do
let b = false in
f i b;
done;
let i = -1 in let b = true in
f i b;
end
let pst4 = Sys.time ()
let _ = Printf.printf "Time 1 = %f sec\n" (pst1-.pre1)
let _ = Printf.printf "Time 2 = %f sec\n" (pst2-.pre2)
let _ = Printf.printf "Time 3 = %f sec\n" (pst3-.pre3)
let _ = Printf.printf "Time 4 = %f sec\n" (pst4-.pre4)
%
Here, there are four loops done. The first two use the HO loop function
from the Loops module (which is very much like my iterPre function); the
last two use loops. One of them inserts the funciton definition at every
usage; the second does a let-bound definition.
For loops of 500 million elements, the timings come out:
Time 1 = 10.230000 sec
Time 2 = 10.230000 sec
Time 3 = 2.610000 sec
Time 4 = 7.020000 sec
on my x86 linux box.
I find these numbers very disturbing. First, just looking at the
discrepency between 3 and 4. I'd imagine this happens because in (3), the
"let b = false in if not b..." gets folded down to just "cnt := !cnt + 1".
To test, this I also ran:
let _ =
let f i b = cnt := !cnt + 1 in
let g i b = () in
begin
let i = 0 in let b = true in
g i b;
for i=1 to top do
let b = false in
f i b;
done;
let i = -1 in let b = true in
g i b;
end
where the unfolding is done manually. This implementation came in at:
Time 5 = 3.570000 sec
This seems to imply that I will triple the speed of my program by making
it (a) non-functional and (b) unreadable. Is this true? Is there another
way to get around this problem?
- Hal
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
^ permalink raw reply [flat|nested] 3+ messages in thread
* [Caml-list] Re: speed, loops vs. hofs
2003-04-28 17:13 [Caml-list] speed, loops vs. hofs Hal Daume III
@ 2003-04-29 14:39 ` Andrzej M. Ostruszka
2003-04-30 0:45 ` [Caml-list] " Manos Renieris
1 sibling, 0 replies; 3+ messages in thread
From: Andrzej M. Ostruszka @ 2003-04-29 14:39 UTC (permalink / raw)
To: Caml Mailing List
On Mon, Apr 28 (2003), Hal Daume III wrote:
[...]
> I find these numbers very disturbing. First, just looking at the
> discrepency between 3 and 4. I'd imagine this happens because in (3), the
> "let b = false in if not b..." gets folded down to just "cnt := !cnt + 1".
<< Disclaimer: This is an answer of a newbie :)) >>
I haven't checked the assembler output but it looks like the
> let f i b = if not b then cnt := !cnt + 1 in
is not inlined in the 4th version -- a bit surprising that default
inlining level is that conservative.
I've cut&pasted the 3rd and 4th functions into the test.ml:
[amo@order ostruszk]$ ocamlopt test.ml
[amo@order ostruszk]$ ./a.out
Time 3 = 1.050000 sec
Time 4 = 3.090000 sec
[amo@order ostruszk]$ ocamlopt -inline 2 test.ml
[amo@order ostruszk]$ ./a.out
Time 3 = 1.050000 sec
Time 4 = 1.050000 sec
As to the module version: it is impossible to inline unknown function.
Functions passed in arguments are unknown to the compiler -- unless
you've got some "whole program analyser" that checks every use of your
loop function -- so every call will be done via the function pointer.
I hope I haven't messed up anything and this will be of any help to you
:)
Best regards
--
____ _ ___
/ | \_/ |/ _ \ Andrzej Marek Ostruszka
/ _ | | (_) | Instytut Fizyki, Uniwersytet Jagiellonski (Cracow)
/_/ L|_|V|_|\___/ (PGP <-- finger ostruszk@order.if.uj.edu.pl)
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [Caml-list] speed, loops vs. hofs
2003-04-28 17:13 [Caml-list] speed, loops vs. hofs Hal Daume III
2003-04-29 14:39 ` [Caml-list] " Andrzej M. Ostruszka
@ 2003-04-30 0:45 ` Manos Renieris
1 sibling, 0 replies; 3+ messages in thread
From: Manos Renieris @ 2003-04-30 0:45 UTC (permalink / raw)
To: Hal Daume III; +Cc: Caml Mailing List
On Mon, Apr 28, 2003 at 10:13:17AM -0700, Hal Daume III wrote:
> One of the primary reasons I use OCaml is because it offers speed,
> together with nice functional stuff, like HOFs. However, it seems that
> using these together might be impossible.
>
> I have a module which allows iteration over a particular datatype. I have
> a method of doing this iteration very quickly using for loops. In
> particular, I have a function of type:
>
> loopPre : hmm -> node -> (node -> int -> bool -> unit) -> unit
>
[...]
> Then, at the usage point, I do something like:
>
> ...1...
> loopPre hmm node_j
> (fun nd len null ->
> ...2...
> )
> ...3...
>
> I had a sneaking feeling that if I actually stuck the for loops into the
> original code I would get speedups. However, "...2..." is pretty long, so
If "...2..." is long, then the cost of the function call is going to
insignificant wrt the cost of running "...2...". In the examples you
wrote, "...2..." is very short, so its cost is comparable to the cost of
function calls, except ocamlopt is smarter than that and inlines most of them.
There is a chance that after the inlinining there is enough context to
optimize the inlined function bodies further, but I don't know the
innards of the compiler well enough to say if this happens or not.
-- Manos
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2003-04-30 0:45 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-04-28 17:13 [Caml-list] speed, loops vs. hofs Hal Daume III
2003-04-29 14:39 ` [Caml-list] " Andrzej M. Ostruszka
2003-04-30 0:45 ` [Caml-list] " Manos Renieris
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox