Mailing list for all users of the OCaml language and system.
From: Bruno De Fraine <Bruno.De.Fraine@vub.ac.be>
To: Caml-list ml <caml-list@inria.fr>
Cc: Oliver Bandel <oliver@first.in-berlin.de>
Subject: ocamllex speed [was Re: [Caml-list] mboxlib reloaded ;-)]
Date: Mon, 24 Sep 2007 20:22:00 +0200
Message-ID: <EFC8B570-0C5E-44D6-88CF-4CC04FA2CAFA@vub.ac.be>
In-Reply-To: <20070427231220.GA1507@first.in-berlin.de>

Hello,

On 28 Apr 2007, at 01:12, Oliver Bandel wrote:
> So, I then checked my mboxlib and saw that it is quite slow,
> compared to what I expected (expect! I did not try it
> on my development machine because I have no mutt installed there)
> and even if native code is much faster, it's nevertheless slow...
> ...so I thought I have to redesign my scanner stage.
> (I use the Str module and ocamllex mixed together; maybe
>  using a plain self-written OCaml scanner might be better here).

I don't know if Oliver ever got to the bottom of this speed problem,
but I have also noticed that ocamllex can be quite slow for simple
scanning. For example, I used this ocamllex source:

{ }
rule translate = parse
| "current_directory" { print_endline (Sys.getcwd ()); translate lexbuf }
| _ { translate lexbuf }  (* any other single character *)
| eof { () }
{
     (* Scan every file named on the command line. *)
     let () =
         for i = 1 to Array.length Sys.argv - 1 do
             let ch = open_in Sys.argv.(i) in
             translate (Lexing.from_channel ch);
             close_in ch
         done
}

And compared it against this version using the Str module:

let re = Str.regexp_string "current_directory"

(* Read every file named on the command line one line at a time. *)
let () =
     for i = 1 to Array.length Sys.argv - 1 do
         let ch = open_in Sys.argv.(i) in
         try
             while true do
                 let line = input_line ch in
                 try
                     (* search_forward raises Not_found when there is no match *)
                     ignore (Str.search_forward re line 0);
                     print_endline (Sys.getcwd ())
                 with Not_found -> ()
             done
         with End_of_file -> close_in ch
     done

Neither version does anything useful except print the current
directory when it encounters the string "current_directory". I tested
both on a 57 MB text file that contains only a few "current_directory"
occurrences. The ocamllex version takes about 3.5s, while the Str
version takes only 0.35s, roughly ten times faster. What causes this
difference? Perhaps there is a high overhead in calling the translate
function for every input character in such big input files, but I
don't know how this can be avoided.
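
To make the per-character hypothesis concrete, here is a minimal sketch in plain OCaml of a scanner that does not do one function call per input character: it jumps straight to the next position where the keyword could start. The name count_occurrences is hypothetical and not part of either program above, and nothing here has been timed against the 57 MB file. Under the same assumption, the analogous change on the ocamllex side would be an extra rule such as [^ 'c']+ { translate lexbuf }, which consumes a whole run of uninteresting characters in a single action.

```ocaml
(* Hypothetical helper, for illustration only: count the non-overlapping
   occurrences of [needle] in [s]. Instead of advancing one character per
   call, it uses String.index_from_opt to jump to the next character that
   could start a match. *)
let count_occurrences needle s =
  let n = String.length needle and len = String.length s in
  if n = 0 then invalid_arg "count_occurrences: empty needle";
  let rec go pos acc =
    if pos > len - n then acc            (* not enough input left *)
    else
      match String.index_from_opt s pos needle.[0] with
      | None -> acc                      (* first character never occurs again *)
      | Some i when i > len - n -> acc   (* candidate too close to the end *)
      | Some i ->
          if String.sub s i n = needle
          then go (i + n) (acc + 1)      (* match: skip past it *)
          else go (i + 1) acc            (* false start: try the next position *)
  in
  go 0 0
```

For example, count_occurrences "current_directory" line could replace the Str.search_forward call in a line-by-line loop when the number of matches per line is wanted rather than just whether one exists.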

Thanks,
Bruno

