From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=AWL autolearn=disabled version=3.1.3 Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105]) by yquem.inria.fr (Postfix) with ESMTP id BBB23BC6D for ; Tue, 25 Sep 2007 00:06:28 +0200 (CEST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgAAAKfR90bAXQImh2dsb2JhbACOGwIBCAon X-IronPort-AV: E=Sophos;i="4.20,292,1186351200"; d="scan'208";a="16718369" Received: from discorde.inria.fr ([192.93.2.38]) by mail4-smtp-sop.national.inria.fr with ESMTP; 25 Sep 2007 00:06:27 +0200 Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by discorde.inria.fr (8.13.6/8.13.6) with ESMTP id l8OM6OZu005823 (version=TLSv1/SSLv3 cipher=RC4-SHA bits=128 verify=OK) for ; Tue, 25 Sep 2007 00:06:27 +0200 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgAAAKfR90bLENaMnmdsb2JhbACOGwIBAQcEBg8Y X-IronPort-AV: E=Sophos;i="4.20,292,1186351200"; d="scan'208";a="1706078" Received: from ipmail01.adl2.internode.on.net ([203.16.214.140]) by mail2-smtp-roc.national.inria.fr with ESMTP; 25 Sep 2007 00:06:23 +0200 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgAAAKfR90Z5LHvc/2dsb2JhbAAM X-IronPort-AV: E=Sophos;i="4.20,292,1186324200"; d="scan'208";a="196897727" Received: from ppp121-44-123-220.lns10.syd6.internode.on.net (HELO [192.168.1.201]) ([121.44.123.220]) by ipmail01.adl2.internode.on.net with ESMTP; 25 Sep 2007 07:36:20 +0930 Subject: Re: ocamllex speed [was Re: [Caml-list] mboxlib reloaded ;-)] From: skaller To: Bruno De Fraine Cc: Caml-list ml , Oliver Bandel In-Reply-To: References: <20070427135425.GA1161@first.in-berlin.de> <20070427162911.GA10099@furbychan.cocan.org> <20070427231220.GA1507@first.in-berlin.de> Content-Type: text/plain Date: Tue, 25 Sep 2007 08:06:19 +1000 Message-Id: <1190671579.9127.18.camel@rosella.wigram> Mime-Version: 1.0 X-Mailer: Evolution 2.10.1 Content-Transfer-Encoding: 7bit X-Miltered: at discorde with ID 46F834E0.001 by Joe's j-chkmail (http://j-chkmail . ensmp . fr)! X-Spam: no; 0.00; ocamllex:01 0200,:01 endline:01 getcwd:01 lexbuf:01 lexbuf:01 argv:01 lexing:01 argv:01 avoided:01 lexer:01 lexeme:01 sourceforge:01 wrote:01 rewrite:01 On Mon, 2007-09-24 at 20:22 +0200, Bruno De Fraine wrote: > { } > rule translate = parse > | "current_directory" { print_endline (Sys.getcwd ()); translate > lexbuf } > | _ { translate lexbuf } > | eof { () } > { > for i = 1 to (Array.length Sys.argv - 1); do > translate (Lexing.from_channel (open_in Sys.argv.(i))) > done ;; > } > > What causes this difference? Perhaps there > is a high overhead in calling the translate function for every input > character in such big input files, but I don't know how this can be > avoided. Easily. Rewrite the lexer to | [a-zA-z_]+ { if lexeme = "current_directory" ... } | _ { translate lexbuf } Then it will eat whole words instead of characters, which should make it about 5 times faster. -- John Skaller Felix, successor to C++: http://felix.sf.net