From: Jeff Henrikson <jehenrik@yahoo.com>
To: Sam Steingold <sds@podval.org>
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] zcat vs CamlZip
Date: Tue, 29 Aug 2006 23:12:16 -0700
Message-ID: <44F52C40.3070702@yahoo.com>
In-Reply-To: <44F48A17.5080005@podval.org>
I was planning on using the library "ocaml gz" in my application, which
is a binding to zlib. I haven't done any detailed benchmarking, but I
presume its speed is comparable to gzip/gunzip since they just call out
to zlib.
http://ocamlplot.sourceforge.net/
Jeff Henrikson
Sam Steingold wrote:
> I read through a huge *.gz file.
> I have two versions of the code:
>
> 1. use Unix.open_process_in "zcat foo.gz".
>
> 2. use gzip.mli (1.2 2002/02/18) as comes with godi 3.09.
>
> it turns out that the zcat version is 3(!) times as fast as the
> gzip.mli one:
>
> Run time: 189.435840 sec
>   Self: 189.435840 sec
>     sys: 183.447465 sec
>     user: 5.988375 sec
>   Children: 0.000000 sec
>     sys: 0.000000 sec
>     user: 0.000000 sec
> GC: minor: 169778
>     major: 478
>     compactions: 3
> Allocated: 5510457762.0 words
> Wall clock: 206 sec (00:03:26)
>
> vs
>
> Run time: 58.471655 sec
>   Self: 54.855429 sec
>     sys: 48.527033 sec
>     user: 6.328396 sec
>   Children: 3.616226 sec
>     sys: 3.168198 sec
>     user: 0.448028 sec
> GC: minor: 43174
>     major: 229
>     compactions: 5
> Allocated: 1401290543.0 words
> Wall clock: 78 sec (00:01:18)
>
> since gzip.mli lacks input_line function, I had to roll my own:
>
> let buf = Buffer.create 1024
> let gz_input_line gz_in char_counter line_counter =
>   Buffer.clear buf;
>   let finish () = incr line_counter; Buffer.contents buf in
>   let rec loop () =
>     let ch = Gzip.input_char gz_in in
>     char_counter := Int64.succ !char_counter;
>     if ch = '\n' then finish () else ( Buffer.add_char buf ch; loop (); ) in
>   try loop ()
>   with End_of_file ->
>     if Buffer.length buf = 0 then raise End_of_file else finish ()
>
> is there something wrong with my gz_input_line?
> is this a known performance issue with the CamlZip library?
>
> thanks.
> Sam.
>
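One thing that might be worth trying in gz_input_line, though I have not measured it and cannot say it explains the gap: read the decompressed data in large chunks and split lines out of the chunk, instead of calling Gzip.input_char once per character. Below is a minimal sketch of that idea. It is written against any function of the shape "read buf pos len" that returns 0 at end of input, which is my assumption about how Gzip.input behaves in your CamlZip version (older releases take a string rather than bytes there). The names make_reader, input_line_chunked and reader_of_string are made up for illustration, and reader_of_string is only an in-memory stand-in so the sketch runs without CamlZip.

```ocaml
(* Line reader over any chunk source [read buf pos len] that returns the
   number of bytes read, and 0 at end of input. *)
type reader = {
  read : bytes -> int -> int -> int;  (* assumed shape of Gzip.input gz_in *)
  chunk : bytes;                      (* current chunk of raw data *)
  mutable pos : int;                  (* next unread byte in [chunk] *)
  mutable len : int;                  (* number of valid bytes in [chunk] *)
  line : Buffer.t;                    (* partial line carried across chunks *)
}

let make_reader ?(chunk_size = 65536) read =
  { read; chunk = Bytes.create chunk_size; pos = 0; len = 0;
    line = Buffer.create 1024 }

(* Index of the next '\n' among the valid bytes of the chunk, if any. *)
let find_newline r =
  let rec go i =
    if i >= r.len then None
    else if Bytes.get r.chunk i = '\n' then Some i
    else go (i + 1)
  in
  go r.pos

(* Return the accumulated line and reset the accumulator. *)
let take_line r =
  let s = Buffer.contents r.line in
  Buffer.clear r.line;
  s

let rec input_line_chunked r =
  (* Refill the chunk once it has been fully consumed. *)
  if r.pos >= r.len then begin
    r.len <- r.read r.chunk 0 (Bytes.length r.chunk);
    r.pos <- 0
  end;
  if r.len = 0 then begin
    (* End of input: hand back a final unterminated line, if there is one. *)
    if Buffer.length r.line = 0 then raise End_of_file;
    take_line r
  end else
    match find_newline r with
    | Some i ->
      Buffer.add_subbytes r.line r.chunk r.pos (i - r.pos);
      r.pos <- i + 1;
      take_line r
    | None ->
      (* No newline in the rest of this chunk: keep it and read more. *)
      Buffer.add_subbytes r.line r.chunk r.pos (r.len - r.pos);
      r.pos <- r.len;
      input_line_chunked r

(* Hypothetical stand-in for a gzip channel: serves bytes from a string. *)
let reader_of_string s =
  let off = ref 0 in
  fun buf pos len ->
    let n = min len (String.length s - !off) in
    Bytes.blit_string s !off buf pos n;
    off := !off + n;
    n
```

With CamlZip this would presumably be used as "let r = make_reader (Gzip.input gz_in)" followed by repeated calls to input_line_chunked r until End_of_file. The character and line counters from the original can be updated per returned line (String.length plus one for the newline) rather than per character.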