From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.1.3 Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by yquem.inria.fr (Postfix) with ESMTP id B6DE9BC6B for ; Sat, 28 Apr 2007 02:54:12 +0200 (CEST) Received: from ipmail01.adl2.internode.on.net (ipmail01.adl2.internode.on.net [203.16.214.140]) by concorde.inria.fr (8.13.6/8.13.6) with ESMTP id l3S0sAHp003241 for ; Sat, 28 Apr 2007 02:54:11 +0200 X-IronPort-AV: E=Sophos;i="4.14,463,1170595800"; d="scan'208";a="120451587" Received: from ppp8-148.lns1.syd7.internode.on.net (HELO [192.168.1.201]) ([59.167.8.148]) by ipmail01.adl2.internode.on.net with ESMTP; 28 Apr 2007 10:24:08 +0930 Subject: Re: [Caml-list] mboxlib reloaded ;-) From: skaller To: Oliver Bandel Cc: caml-list@inria.fr In-Reply-To: <20070427231220.GA1507@first.in-berlin.de> References: <20070427135425.GA1161@first.in-berlin.de> <20070427162911.GA10099@furbychan.cocan.org> <20070427231220.GA1507@first.in-berlin.de> Content-Type: text/plain Date: Sat, 28 Apr 2007 10:54:06 +1000 Message-Id: <1177721646.16582.8.camel@rosella.wigram> Mime-Version: 1.0 X-Mailer: Evolution 2.10.1 Content-Transfer-Encoding: 7bit X-Miltered: at concorde with ID 46329B32.000 by Joe's j-chkmail (http://j-chkmail . ensmp . fr)! X-Spam: no; 0.00; reloaded:01 0200,:01 bandel:01 native-code:01 ocamllex:01 ocamllex:01 ocaml:01 lexer:01 sourceforge:01 wrote:01 oliver:01 caml-list:01 finite:02 tagged:03 let:03 On Sat, 2007-04-28 at 01:12 +0200, Oliver Bandel wrote: > So, I then checked my mboxlib and saw that it is quite slow, > compared to what I expected ( expect! I did not tried it > on my development machine because I have nomutt installed there) > and even if native-code smuch faster, it's nevertheless slow... > ...so I thought I have to redesign my scanner-stage. > (I use Str-module and ocamnllex mixed together; maybe > using a plain selfwritten OCaml-scanner might be better here). Ocamllex generates very fast scanner: it is using a very high-tech tagged deterministic finite state automaton with a driver written in C (so no boxing etc processing text buffers). I doubt you can hand code anything as fast as Ocamllex in C, let alone in Ocaml. You should check the size (number of states) of the generated lexer. It will run faster with small number of states where the matrix fits easily in the cache. -- John Skaller Felix, successor to C++: http://felix.sf.net