Subject: Re: [Caml-list] Thread safe Str
From: skaller
Reply-To: skaller@users.sourceforge.net
To: Martin Jambon
Cc: Xavier Leroy, Christophe TROESTLER, "O'Caml Mailing List"
Message-Id: <1105516763.2574.231.camel@pelican.wigram>
Date: 12 Jan 2005 18:59:23 +1100

On Wed, 2005-01-12 at 07:53, Martin Jambon wrote:
> On 11 Jan 2005, skaller wrote:
> > See the ulex package for a model.
> > The problem with micmatch is precisely that it does use Str.
>
> It uses either Str or PCRE.
> You can also integrate ulex by yourself if it is possible.

I'm not sure .. ulex doesn't use NFAs, does it?
AFAIK it doesn't support captures.

The problem with micmatch is that it probably doesn't do
what a system supported in the core language can do:
linear matching over all cases. Felix does do it. In this code

    regmatch .. with
    | re1 => ..
    | re2 => ..
    | re3 => ..
    endmatch

the input characters (from ...) are read exactly once. Not only is
there no backtracking, since a DFA is used, but it isn't a sequence
of comparisons either: it's a DFA-based tokeniser, with the token
selecting the RHS expression.

I would guess micmatch does not support that, although
it probably could.

-- 
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850,
snail: PO BOX 401 Glebe NSW 2037 Australia
Check out the Felix programming language http://felix.sf.net
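
To make "linear matching over all cases" concrete, here is a minimal OCaml sketch. It is not what Felix, ulex or micmatch actually generate: the three patterns, the state numbering and the names `case`, `step`, `accepting` and `classify` are all assumptions made up for illustration, and the DFA is built by hand rather than by a compiler. The point it tries to show is only that the alternatives are merged into one automaton, so each input character is inspected once and the final state picks the branch.

```ocaml
(* Hypothetical alternatives, in priority order:
     "if"  |  [a-z]+  |  [0-9]+
   merged by hand into a single DFA. *)
type case = Kw_if | Ident | Int

(* Sink state: no alternative can match any more. *)
let dead = -1

(* Transition function of the combined DFA.
   0 = start, 1 = seen "i", 2 = seen "if",
   3 = other lowercase word, 4 = digits. *)
let step state c =
  match state, c with
  | 0, 'i' -> 1
  | 0, 'a' .. 'z' -> 3
  | 0, '0' .. '9' -> 4
  | 1, 'f' -> 2
  | (1 | 2 | 3), 'a' .. 'z' -> 3
  | 4, '0' .. '9' -> 4
  | _ -> dead

(* Which alternative, if any, an end state selects. *)
let accepting = function
  | 1 | 3 -> Some Ident
  | 2 -> Some Kw_if
  | 4 -> Some Int
  | _ -> None

(* Match the whole string: one left-to-right pass, no backtracking,
   and no "try re1, then re2, then re3" sequence. *)
let classify s =
  let n = String.length s in
  let rec go state i =
    if state = dead then None
    else if i = n then accepting state
    else go (step state s.[i]) (i + 1)
  in
  go 0 0

(* The result then selects the right-hand side, as in the regmatch above. *)
let describe s =
  match classify s with
  | Some Kw_if -> "keyword"
  | Some Ident -> "identifier"
  | Some Int -> "integer"
  | None -> "no match"
```

Note that "if" and "ifx" share the states for their common prefix; a sequence of separate regexp tests would rescan that prefix once per alternative.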