* best and fastest way to read lines from a file?
@ 2007-10-01 21:27 YC
2007-10-01 21:55 ` [Caml-list] " Daniel Bünzli
2007-10-01 21:55 ` Olivier Roussel
0 siblings, 2 replies; 17+ messages in thread
From: YC @ 2007-10-01 21:27 UTC (permalink / raw)
To: caml-list
[-- Attachment #1: Type: text/plain, Size: 1279 bytes --]
Hi all -
Newbie question: I'm wondering what's the most efficient way to read in a
file line by line? I wrote a routine in both python and ocaml to read in a
file with 345K lines to do line count and was surprised that python's code
run roughly 3x faster.
I thought the speed should be equivalent and/or somewhat in ocaml favor,
given this is an IO-bound comparison, but perhaps Python's simplistic for
loop have a read-ahead buffer built-in, and perhaps ocaml's input channel is
unbuffered, but I'm not sure how to write a buffered code that can do a line
by line read-in.
Any insight is appreciated, thanks ;)
yc
Python code:
# test.py
#!/usr/bin/python
file = <345k-line.txt>
count = 0
for line in open (file, "r"):
count = count + 1
print "Done: ", count
OCaml code:
(* test.ml *)
let rec line_count filename =
let f = open_in filename in
let rec loop file count =
try
ignore (input_line file);
loop file (count + 1)
with
End_of_file -> count
in
loop f 0;;
let count = line_count <345k-line.txt> in
Printf.printf "Done: %d" count;;
Test
$ time ./test.py
Done: 345001
real 0m0.416s
user 0m0.101s
sys 0m0.247s
$ ocamlopt -o test test.ml
$ time ./test
Done: 345001
real 0m1.483s
user 0m0.631s
sys 0m0.685s
[-- Attachment #2: Type: text/html, Size: 1844 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Caml-list] best and fastest way to read lines from a file?
2007-10-01 21:27 best and fastest way to read lines from a file? YC
@ 2007-10-01 21:55 ` Daniel Bünzli
2007-10-01 22:29 ` YC
2007-10-01 21:55 ` Olivier Roussel
1 sibling, 1 reply; 17+ messages in thread
From: Daniel Bünzli @ 2007-10-01 21:55 UTC (permalink / raw)
To: caml-list List
Le 1 oct. 07 à 23:27, YC a écrit :
> OCaml code:
> (* test.ml *)
> let rec line_count filename =
> let f = open_in filename in
> let rec loop file count =
> try
> ignore (input_line file);
> loop file (count + 1)
> with
> End_of_file -> count
> in
> loop f 0;;
Your function is not tail recursive. A function is tail-recursive if
there is nothing to do after the recursive call. You might believe
this is the case in your function, but it is not the case because of
the exception handler. Try this instead :
let rec line_count filename =
let ic = open_in filename in
let rec loop ic count =
let line = try Some (input_line ic) with End_of_file -> None in
match line with
| Some _ -> loop ic (count+1)
| None -> count
in
loop ic 0
It should run faster.
Daniel
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Caml-list] best and fastest way to read lines from a file?
2007-10-01 21:27 best and fastest way to read lines from a file? YC
2007-10-01 21:55 ` [Caml-list] " Daniel Bünzli
@ 2007-10-01 21:55 ` Olivier Roussel
2007-10-02 12:39 ` Mattias Engdegård
1 sibling, 1 reply; 17+ messages in thread
From: Olivier Roussel @ 2007-10-01 21:55 UTC (permalink / raw)
To: YC; +Cc: caml-list
YC a écrit :
> Hi all -
Hi!
> OCaml code:
> (* test.ml <http://test.ml> *)
> let rec line_count filename =
> let f = open_in filename in
> let rec loop file count =
> try
> ignore (input_line file);
> loop file (count + 1)
> with
> End_of_file -> count
> in
> loop f 0;;
>
> let count = line_count <345k-line.txt> in
> Printf.printf "Done: %d" count;;
The following solution is ~2.5x faster than the Python implementation on
my computer. Because there is no more exceptions in recursive calls, and
thanks to tail-recursion.
let readline f =
try Some (input_line f)
with End_of_file -> None;;
let line_count filename =
let f = open_in filename in
let rec loop count =
match (readline f) with
| Some(_) -> loop (count+1)
| None -> count in
loop 0;;
let count = line_count <345k-line.txt> in
Printf.printf "Done: %d\n" count;;
--
Olivier Roussel
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Caml-list] best and fastest way to read lines from a file?
2007-10-01 21:55 ` [Caml-list] " Daniel Bünzli
@ 2007-10-01 22:29 ` YC
0 siblings, 0 replies; 17+ messages in thread
From: YC @ 2007-10-01 22:29 UTC (permalink / raw)
To: Daniel Bünzli; +Cc: caml-list List
[-- Attachment #1: Type: text/plain, Size: 573 bytes --]
Thanks to everyone replied - your answers are extremely helpful.
> Your function is not tail recursive. A function is tail-recursive if
> there is nothing to do after the recursive call. You might believe
> this is the case in your function, but it is not the case because of
> the exception handler. Try this instead :
I did wonder whether my function was tail recursive but thought maybe it
is. Thanks for the correction from everyone on this point. And the pattern
of returning Some or None makes sense to me now. Rerun the test makes me
feel better ;)
Thanks,
yc
[-- Attachment #2: Type: text/html, Size: 663 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Caml-list] best and fastest way to read lines from a file?
2007-10-01 21:55 ` Olivier Roussel
@ 2007-10-02 12:39 ` Mattias Engdegård
2007-10-02 12:56 ` Brian Hurt
0 siblings, 1 reply; 17+ messages in thread
From: Mattias Engdegård @ 2007-10-02 12:39 UTC (permalink / raw)
To: olivier.roussel; +Cc: yinso.chen, caml-list
>let line_count filename =
> let f = open_in filename in
> let rec loop count =
> match (readline f) with
> | Some(_) -> loop (count+1)
> | None -> count in
> loop 0;;
Something like this should be even faster:
exception Done of int
let line_count file =
let rec loop count =
let _ =
try
input_line f
with End_of_file -> raise (Done count)
in
loop (count + 1)
in
try loop 0 with Done x -> x
as it avoids unnecessary consing in the inner loop.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Caml-list] best and fastest way to read lines from a file?
2007-10-02 12:39 ` Mattias Engdegård
@ 2007-10-02 12:56 ` Brian Hurt
2007-10-02 16:15 ` kirillkh
0 siblings, 1 reply; 17+ messages in thread
From: Brian Hurt @ 2007-10-02 12:56 UTC (permalink / raw)
To: caml-list
[-- Attachment #1: Type: text/plain, Size: 939 bytes --]
Mattias Engdegård wrote:
>>let line_count filename =
>> let f = open_in filename in
>> let rec loop count =
>> match (readline f) with
>> | Some(_) -> loop (count+1)
>> | None -> count in
>> loop 0;;
>>
>>
>
>Something like this should be even faster:
>
>exception Done of int
>
>let line_count file =
> let rec loop count =
> let _ =
> try
> input_line f
> with End_of_file -> raise (Done count)
> in
> loop (count + 1)
> in
> try loop 0 with Done x -> x
>
>as it avoids unnecessary consing in the inner loop.
>
>__
>
This should be a FAQ.
let line_count file =
let rec loop count =
let test =
try
let _ = input_line f in
true
with
| Not_found -> false
in
if test then
loop (count + 1)
else
count
in
loop 0
;;
No consing, no unnecessary try/catch, tail recursive.
Brian
[-- Attachment #2: Type: text/html, Size: 1845 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Caml-list] best and fastest way to read lines from a file?
2007-10-02 12:56 ` Brian Hurt
@ 2007-10-02 16:15 ` kirillkh
2007-10-02 17:10 ` verlyck, Bruno.Verlyck
2007-10-02 18:02 ` kirillkh
0 siblings, 2 replies; 17+ messages in thread
From: kirillkh @ 2007-10-02 16:15 UTC (permalink / raw)
To: Brian Hurt; +Cc: caml-list
[-- Attachment #1: Type: text/plain, Size: 1155 bytes --]
> This should be a FAQ.
>
Since we're talking of 10+ lines of code and only one case among many
possible (you might also want to do something fairly similar, but not quite
the same, as iterating over all words or characters in a file, doing
something else than counting, etc.), I would rather see it implemented in a
library as combinator. What I have in mind is a function that goes over a
file and invokes some user code on each block of
bytes/characters/lines/words/... The points of customization would be:
* how to detect the start and end of block
* routine to pass the blocks to
Then, on top of this combinator, build block-specific ones: for byte, char,
line, word blocks. Also make it possible to customize buffering behavior.
Being new to OCaml, I'm interested in comments - is what I suggest a good
idea? If yes, why hasn't anyone implemented it yet? One could argue that
it's trivial and can be implemented in each use case anew, but among 5
different solutions posted so far, each has its own problems, besides the
supposedly ideal 5-th -- I'd take this as an indication that, although
simple, it's not really trivial to write this thing.
[-- Attachment #2: Type: text/html, Size: 1469 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Caml-list] best and fastest way to read lines from a file?
2007-10-02 16:15 ` kirillkh
@ 2007-10-02 17:10 ` verlyck, Bruno.Verlyck
2007-10-02 18:02 ` kirillkh
1 sibling, 0 replies; 17+ messages in thread
From: verlyck, Bruno.Verlyck @ 2007-10-02 17:10 UTC (permalink / raw)
To: kirillkh; +Cc: bhurt, caml-list
Date: Tue, 2 Oct 2007 18:15:57 +0200
From: kirillkh <kirillkh@gmail.com>
Hi,
> This should be a FAQ.
Since we're talking of 10+ lines of code and only one case among
many possible (you might also want to do something fairly similar,
but not quite the same, as iterating over all words or characters
in a file, doing something else than counting, etc.), I would
rather see it implemented in a library as combinator. What I have
in mind is a function that goes over a file and invokes some user
code on each block of bytes/characters/lines/words/... The points
of customization would be:
* how to detect the start and end of block
* routine to pass the blocks to
Then, on top of this combinator, build block-specific ones: for
byte, char, line, word blocks. Also make it possible to customize
buffering behavior.
Being new to OCaml, I'm interested in comments; is what I suggest a
good idea?
Yes, why not ?
If yes, why hasn't anyone implemented it yet?
I believe Cash (in the Hump:
http://caml.inria.fr/cgi-bin/hump.fr.cgi?contrib=86)
has some of the things you ask for: look around fold_in_channel (as a
combinator; yes, it *is* 5 lines of code), and for what you call
blocks, chapters 6 & 7 of the documentation (Reading delimited strings
& Record I/O and field parsing). Buffering is also parameterizable
(between 1 and 4Kb, no line buffering, sorry, too much C code to
modify in the Ocaml runtime).
It may not suit your taste, but when generalizing, everybody tends to
have one's own very specific idea of how to do it. Human nature...
At least Cash can give you some ideas.
Of course, the OP was asking for the fastest way... OK, we aren't
anymore.
HTH,
Bruno.
Disclaimer: Cash is still not ported to Ocaml 3.10; but 3.09 is fine.
Have to choose: camlp4 or 5... ?
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Caml-list] best and fastest way to read lines from a file?
2007-10-02 16:15 ` kirillkh
2007-10-02 17:10 ` verlyck, Bruno.Verlyck
@ 2007-10-02 18:02 ` kirillkh
2007-10-02 19:35 ` skaller
2007-10-02 20:23 ` Olivier Andrieu
1 sibling, 2 replies; 17+ messages in thread
From: kirillkh @ 2007-10-02 18:02 UTC (permalink / raw)
Cc: caml-list
[-- Attachment #1: Type: text/plain, Size: 3083 bytes --]
Replying to a private mail from Brian:
2007/10/2, Brian Hurt <bhurt@janestcapital.com>:
>
> kirillkh wrote:
>
>
> This should be a FAQ.
> >
>
> Since we're talking of 10+ lines of code and only one case among many
> possible (you might also want to do something fairly similar, but not quite
> the same, as iterating over all words or characters in a file, doing
> something else than counting, etc.), I would rather see it implemented in a
> library as combinator. What I have in mind is a function that goes over a
> file and invokes some user code on each block of
> bytes/characters/lines/words/... The points of customization would be:
> * how to detect the start and end of block
> * routine to pass the blocks to
>
> Then, on top of this combinator, build block-specific ones: for byte,
> char, line, word blocks. Also make it possible to customize buffering
> behavior.
>
> Being new to OCaml, I'm interested in comments - is what I suggest a good
> idea? If yes, why hasn't anyone implemented it yet? One could argue that
> it's trivial and can be implemented in each use case anew, but among 5
> different solutions posted so far, each has its own problems, besides the
> supposedly ideal 5-th -- I'd take this as an indication that, although
> simple, it's not really trivial to write this thing.
>
> Note that the fifth version is only perfect because the author knew that
> he didn't need to keep the line around. Doing something even moderately
> more complicated would almost certainly require an allocation in the main
> loop- not that this is that big of a problem (Ocaml allocation is lightning
> fast).
>
> The problem with this library is when to stop. A real simple "fold over
> the lines of a file" interface might be nice, but there'll always be
> pressure to add just one more feature, deal with just one more slightly more
> complex case- at the end of which you get a badly specified and badly
> implemented parser combinator library.
>
> Brian
>
OK, so I'll give up the parsing/buffering part and only leave efficient
exception handling. This should leave the user free to do anything with it,
but prevent performance pitfalls. The following is based on Mattias
Engdegard's code:
(* I couldn't figure out, how to declare a polymorphic exception properly *)
exception Done of 'a
let fold_file (file: in_channel)
(read_func: in_channel->'a)
(elem_func: 'a->'b->'b)
(seed: 'b) =
let rec loop prev_val =
let input =
try read_func file
with End_of_file -> raise (Done prev_val)
in
let combined_val = elem_func input prev_val in
loop combined_val
in
try loop seed with Done x -> x
And the usage for line counting:
let line_count filename =
let f = open_in filename in
let counter _ count = count + 1 in
fold_file f readline counter 0
Since it's library code, we can tolerate the little annoyance of the second
try-catch. As far as I can tell, this code has the same performance
characteristics as yours: no consing + tail recursion. Any other problems
with it?
[-- Attachment #2: Type: text/html, Size: 4600 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Caml-list] best and fastest way to read lines from a file?
2007-10-02 18:02 ` kirillkh
@ 2007-10-02 19:35 ` skaller
2007-10-02 21:05 ` kirillkh
2007-10-02 20:23 ` Olivier Andrieu
1 sibling, 1 reply; 17+ messages in thread
From: skaller @ 2007-10-02 19:35 UTC (permalink / raw)
To: kirillkh; +Cc: caml-list
On Tue, 2007-10-02 at 20:02 +0200, kirillkh wrote:
> Replying to a private mail from Brian:
> (* I couldn't figure out, how to declare a polymorphic exception
> properly *)
> exception Done of 'a
That's easy -- you can't: even if you could, how could
you possibly use it?
This compiles fine:
type t = { field : 'a. 'a }
exception Done of t
but 'field' is useless. This is not at all the same as
let f (x:'a) (g:'a -> int) =
match g x with
| 0 -> ..
| ..
because *inside* the function, 'a is not a type variable,
and the code is not polymorphic, it is simply a sole
unknown type, sometimes said to be monomorphised.
The problem with exceptions is that they're not captured,
so they cannot be polymorphic. Exceptions SUCK because
their context is not delimited -- you can throw all the way
out of the mainline .. :)
[This happens to me regularly and it can takes days to figure
out what is Not_found ..]
--
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Caml-list] best and fastest way to read lines from a file?
2007-10-02 18:02 ` kirillkh
2007-10-02 19:35 ` skaller
@ 2007-10-02 20:23 ` Olivier Andrieu
2007-10-02 20:49 ` kirillkh
1 sibling, 1 reply; 17+ messages in thread
From: Olivier Andrieu @ 2007-10-02 20:23 UTC (permalink / raw)
To: kirillkh; +Cc: caml-list
On 10/2/07, kirillkh <kirillkh@gmail.com> wrote:
> OK, so I'll give up the parsing/buffering part and only leave efficient
> exception handling. This should leave the user free to do anything with it,
> but prevent performance pitfalls. The following is based on Mattias
> Engdegard's code:
>
> (* I couldn't figure out, how to declare a polymorphic exception properly *)
> exception Done of 'a
>
> let fold_file (file: in_channel)
> (read_func: in_channel->'a)
> (elem_func: 'a->'b->'b)
> (seed: 'b) =
> let rec loop prev_val =
> let input =
> try read_func file
> with End_of_file -> raise (Done prev_val)
> in
> let combined_val = elem_func input prev_val in
> loop combined_val
> in
> try loop seed with Done x -> x
>
> And the usage for line counting:
>
> let line_count filename =
> let f = open_in filename in
> let counter _ count = count + 1 in
> fold_file f readline counter 0
>
> Since it's library code, we can tolerate the little annoyance of the second
> try-catch. As far as I can tell, this code has the same performance
> characteristics as yours: no consing + tail recursion. Any other problems
> with it?
well apart from the fact that you cannot have "polymorphic exceptions"
in OCaml, this kind of code is IMHO much more natural with an
imperative loop instead of a functional one:
let fold_file read chan f init =
let acc = ref init in
begin
try while true do
let d = read chan in
acc := f d !acc
done
with End_of_file -> ()
end ;
!acc
--
Olivier
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Caml-list] best and fastest way to read lines from a file?
2007-10-02 20:23 ` Olivier Andrieu
@ 2007-10-02 20:49 ` kirillkh
2007-10-02 21:10 ` Jon Harrop
2007-10-02 21:15 ` David Allsopp
0 siblings, 2 replies; 17+ messages in thread
From: kirillkh @ 2007-10-02 20:49 UTC (permalink / raw)
Cc: caml-list
[-- Attachment #1: Type: text/plain, Size: 2028 bytes --]
2007/10/2, Olivier Andrieu <oandrieu@gmail.com>:
>
> On 10/2/07, kirillkh <kirillkh@gmail.com> wrote:
> > OK, so I'll give up the parsing/buffering part and only leave efficient
> > exception handling. This should leave the user free to do anything with
> it,
> > but prevent performance pitfalls. The following is based on Mattias
> > Engdegard's code:
> >
> > (* I couldn't figure out, how to declare a polymorphic exception
> properly *)
> > exception Done of 'a
> >
> > let fold_file (file: in_channel)
> > (read_func: in_channel->'a)
> > (elem_func: 'a->'b->'b)
> > (seed: 'b) =
> > let rec loop prev_val =
> > let input =
> > try read_func file
> > with End_of_file -> raise (Done prev_val)
> > in
> > let combined_val = elem_func input prev_val in
> > loop combined_val
> > in
> > try loop seed with Done x -> x
> >
> > And the usage for line counting:
> >
> > let line_count filename =
> > let f = open_in filename in
> > let counter _ count = count + 1 in
> > fold_file f readline counter 0
> >
> > Since it's library code, we can tolerate the little annoyance of the
> second
> > try-catch. As far as I can tell, this code has the same performance
> > characteristics as yours: no consing + tail recursion. Any other
> problems
> > with it?
>
> well apart from the fact that you cannot have "polymorphic exceptions"
> in OCaml, this kind of code is IMHO much more natural with an
> imperative loop instead of a functional one:
>
>
> let fold_file read chan f init =
> let acc = ref init in
> begin
> try while true do
> let d = read chan in
> acc := f d !acc
> done
> with End_of_file -> ()
> end ;
> !acc
>
> --
> Olivier
>
A little weird to see such inherently functional construct as fold
implemented imperatively. But it's fine with me, as long as it does the job.
I wonder, though, how would the performance of a line counter based on your
code compare with the one suggested by Brian.
[-- Attachment #2: Type: text/html, Size: 3191 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Caml-list] best and fastest way to read lines from a file?
2007-10-02 19:35 ` skaller
@ 2007-10-02 21:05 ` kirillkh
2007-10-02 21:07 ` Jon Harrop
0 siblings, 1 reply; 17+ messages in thread
From: kirillkh @ 2007-10-02 21:05 UTC (permalink / raw)
To: skaller; +Cc: caml-list
[-- Attachment #1: Type: text/plain, Size: 1384 bytes --]
2007/10/2, skaller <skaller@users.sourceforge.net>:
>
> On Tue, 2007-10-02 at 20:02 +0200, kirillkh wrote:
> > Replying to a private mail from Brian:
>
> > (* I couldn't figure out, how to declare a polymorphic exception
> > properly *)
> > exception Done of 'a
>
> That's easy -- you can't: even if you could, how could
> you possibly use it?
>
> This compiles fine:
>
> type t = { field : 'a. 'a }
> exception Done of t
>
> but 'field' is useless. This is not at all the same as
>
> let f (x:'a) (g:'a -> int) =
> match g x with
> | 0 -> ..
> | ..
>
> because *inside* the function, 'a is not a type variable,
> and the code is not polymorphic, it is simply a sole
> unknown type, sometimes said to be monomorphised.
>
> The problem with exceptions is that they're not captured,
> so they cannot be polymorphic. Exceptions SUCK because
> their context is not delimited -- you can throw all the way
> out of the mainline .. :)
>
> [This happens to me regularly and it can takes days to figure
> out what is Not_found ..]
Is there a way to instantiate an exception with a value of unspecified type
and then do explicit casting on catch?
Is it a deficiency in the language? I suppose OCaml could support
polymorphic exceptions, if they were checked, like in Java, and appeared in
function signatures in a similar way to parameters and return values.
[-- Attachment #2: Type: text/html, Size: 1922 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Caml-list] best and fastest way to read lines from a file?
2007-10-02 21:05 ` kirillkh
@ 2007-10-02 21:07 ` Jon Harrop
0 siblings, 0 replies; 17+ messages in thread
From: Jon Harrop @ 2007-10-02 21:07 UTC (permalink / raw)
To: caml-list
On Tuesday 02 October 2007 22:05:07 kirillkh wrote:
> Is it a deficiency in the language?
A constraint of the type system by design.
> I suppose OCaml could support
> polymorphic exceptions, if they were checked, like in Java, and appeared in
> function signatures in a similar way to parameters and return values.
The exn type would need an undefined number of type parameters.
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Caml-list] best and fastest way to read lines from a file?
2007-10-02 20:49 ` kirillkh
@ 2007-10-02 21:10 ` Jon Harrop
2007-10-02 21:15 ` David Allsopp
1 sibling, 0 replies; 17+ messages in thread
From: Jon Harrop @ 2007-10-02 21:10 UTC (permalink / raw)
To: caml-list
On Tuesday 02 October 2007 21:49:41 kirillkh wrote:
> A little weird to see such inherently functional construct as fold
> implemented imperatively.
On the contrary, this is the core functionality of ML that makes it so
practically useful: combining functional and imperative programming neatly
and efficiently.
Look at the Array and Hashtbl modules of the stdlib for more examples of code
that is both imperative and functional.
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: [Caml-list] best and fastest way to read lines from a file?
2007-10-02 20:49 ` kirillkh
2007-10-02 21:10 ` Jon Harrop
@ 2007-10-02 21:15 ` David Allsopp
2007-10-02 22:23 ` skaller
1 sibling, 1 reply; 17+ messages in thread
From: David Allsopp @ 2007-10-02 21:15 UTC (permalink / raw)
To: 'kirillkh'; +Cc: caml-list
> A little weird to see such inherently functional construct as fold
implemented imperatively. But it's fine with me, as long as it does the job.
I
> wonder, though, how would the performance of a line counter based on your
code compare with the one suggested by Brian.
It's faster, though only slightly - I was going to post an imperative
solution earlier, too. Even in Ocaml, a recursive call (tail or otherwise)
has some costs not present in a simple loop. However, I don't agree that
it's more natural - "ugly" is more the word I'd use!
David
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: [Caml-list] best and fastest way to read lines from a file?
2007-10-02 21:15 ` David Allsopp
@ 2007-10-02 22:23 ` skaller
0 siblings, 0 replies; 17+ messages in thread
From: skaller @ 2007-10-02 22:23 UTC (permalink / raw)
To: David Allsopp; +Cc: 'kirillkh', caml-list
On Tue, 2007-10-02 at 22:15 +0100, David Allsopp wrote:
> > A little weird to see such inherently functional construct as fold
> implemented imperatively. But it's fine with me, as long as it does the job.
> I
> > wonder, though, how would the performance of a line counter based on your
> code compare with the one suggested by Brian.
>
> It's faster, though only slightly - I was going to post an imperative
> solution earlier, too. Even in Ocaml, a recursive call (tail or otherwise)
> has some costs not present in a simple loop. However, I don't agree that
> it's more natural - "ugly" is more the word I'd use!
In Felix, tail-rec calls are generally faster than loops.
The reason is that the compiler is better able to reason
about the semantics and so it can do better optimisations.
OTOH loops degenerate to conditional goto-spaghetti and
would require SSA analysis to recover.
For example, given:
fun f(x,y) ... f (a,b)
the implementation generates
var x,y;
set_initial_values(x,y);
start:
...
paropt x,y = a,b;
goto start;
where 'paropt' serialises the parallel assignment
using an optimisation algorithm which tries to minimise
the number of assignments and temporaries used.
this has a significant impact on performance .. Felix version
of Ackermann's function is more then 2.5x faster than Ocamlopt,
and almost 2x faster than C. Ackermann has 2 tail-rec calls.
Trying to do this with loops is much harder because, unlike
the tail-rec formulation, the 'inputs' and 'outputs' connected
by the loops are not explicit, as they are when a function
is used and parameters and arguments define the input/output
relationship (and everything else local is a temporary with no
persistence: Felix doesn't not allow functions to have side
effects, so only local data is mutable).
--
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2007-10-02 22:23 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-10-01 21:27 best and fastest way to read lines from a file? YC
2007-10-01 21:55 ` [Caml-list] " Daniel Bünzli
2007-10-01 22:29 ` YC
2007-10-01 21:55 ` Olivier Roussel
2007-10-02 12:39 ` Mattias Engdegård
2007-10-02 12:56 ` Brian Hurt
2007-10-02 16:15 ` kirillkh
2007-10-02 17:10 ` verlyck, Bruno.Verlyck
2007-10-02 18:02 ` kirillkh
2007-10-02 19:35 ` skaller
2007-10-02 21:05 ` kirillkh
2007-10-02 21:07 ` Jon Harrop
2007-10-02 20:23 ` Olivier Andrieu
2007-10-02 20:49 ` kirillkh
2007-10-02 21:10 ` Jon Harrop
2007-10-02 21:15 ` David Allsopp
2007-10-02 22:23 ` skaller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox