From: Brian Hurt <bhurt@spnz.org>
To: "André Luiz Moura" <aluizmoura@yahoo.com.br>
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] Reading a large text file
Date: Sat, 1 May 2004 09:03:56 -0500 (CDT) [thread overview]
Message-ID: <Pine.LNX.4.44.0405010847480.9460-100000@localhost.localdomain> (raw)
In-Reply-To: <20040428152813.47355.qmail@web60608.mail.yahoo.com>
I'm still digging through a serious backlog of email, so your question may
have already been answered. If so, ignore this.
But what I'm guessing is happening is that you are appending (adding to
the end of) your list, and that this is what is killing you. To add an
element to the *end* of a list, Ocaml has to completely reallocate the
entire list- turning what you might think is an O(1) operation into an
O(n) operation.
The solution to this is to build the list backwards- instead of adding
things to the end of the list, add them to the begining. Then, when
you're done, just reverse the list using List.rev.
Lets look at the example of just reading the lines from a file and making
a list of them. This code is bad:
let readfile chan =
let rec loop lst =
try
loop (lst @ [ (input_line chan) ])
with
| End_of_file -> lst
in
loop []
;;
It works, but to read n lines requires O(n^2) work, because the @ has to
reallocate the entire list every iteration. Worse yet it isn't tail
recursive (a recursive call inside a try/catch isn't a tail call).
A better implementation of this would be:
let readfile chan =
let rec loop rlst =
let line, eof =
try
(input_line chan), false
with
| End_of_file -> "", true
in
if not eof then
loop (line :: rlst)
else
List.rev rlst
in
loop []
;;
Now, the first thing to notice is that we're using the :: operator (which
is O(1)), not the @ operator (which is O(n))- we're prepending things onto
the list, not appending them. We're building up the list backwards, and
then, when we hit the end of the file, reversing the entire list. This is
a standard pattern in Ocaml.
The second thing to notice is that the recursive call has been hoisted up
out of the try/catch block. We've introduced a new boolean flag to note
when we've hit the end of the file- catching the exception simply sets the
flag to true. So now our function is tail recursive, and operates in
constant stack space.
Brian
On Wed, 28 Apr 2004, André Luiz Moura wrote:
> Hi List,
>
> I wrote an OCaml function to read a text file with a million of lines with five separate columns for spaces. I based myself on previous messages of Caml-list, but however the work is taking time frightful (many minutes).
> This function written in C, for example, does not take more than 4 seconds.
>
> I am executing the program in a PC Pentium III of 128 MB of RAM under Windows platform. How would be implemented the function to decide this problem?
>
> File format:
> <vertex #> <x> <y> [attributes] [boundary marker]
>
> Thanks,
>
> André
>
>
>
>
>
>
> ---------------------------------
> Yahoo! Messenger - Fale com seus amigos online. Instale agora!
--
"Usenet is like a herd of performing elephants with diarrhea -- massive,
difficult to redirect, awe-inspiring, entertaining, and a source of
mind-boggling amounts of excrement when you least expect it."
- Gene Spafford
Brian
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
next prev parent reply other threads:[~2004-05-01 13:58 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2004-04-28 15:28 André Luiz Moura
2004-04-28 16:28 ` Richard Jones
2004-05-01 14:03 ` Brian Hurt [this message]
2004-05-01 15:43 ` Rahul Siddharthan
2004-05-01 16:00 ` [Caml-list] [OcamlDoc] langage support sejourne kevin
2004-05-14 7:15 ` Maxence Guesdon
2004-05-01 18:05 ` [Caml-list] Reading a large text file Richard Jones
2004-05-01 18:25 ` Charles Forsyth
2004-05-01 19:25 ` skaller
2004-05-01 19:51 ` Alain.Frisch
2004-05-01 20:40 ` skaller
2004-05-01 21:11 ` [Caml-list] Private types skaller
2004-05-01 21:33 ` [Caml-list] Reading a large text file Alain.Frisch
2004-05-17 5:28 ` Eric Stokes
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Pine.LNX.4.44.0405010847480.9460-100000@localhost.localdomain \
--to=bhurt@spnz.org \
--cc=aluizmoura@yahoo.com.br \
--cc=caml-list@inria.fr \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox