From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by yquem.inria.fr (Postfix) with ESMTP id 6C4B4BB9C for ; Sun, 4 Dec 2005 16:19:40 +0100 (CET) Received: from cenn.mc.mpls.visi.com (cenn.mc.mpls.visi.com [208.42.156.9]) by concorde.inria.fr (8.13.0/8.13.0) with ESMTP id jB4FJdmD019581 for ; Sun, 4 Dec 2005 16:19:40 +0100 Received: from [192.168.42.2] (bhurt.dsl.visi.com [208.42.141.66]) by cenn.mc.mpls.visi.com (Postfix) with ESMTP id 0728E8A86; Sun, 4 Dec 2005 09:19:39 -0600 (CST) Date: Sun, 4 Dec 2005 09:25:46 -0600 (CST) From: Brian Hurt X-X-Sender: bhurt@localhost.localdomain To: skaller Cc: caml-list@yquem.inria.fr Subject: Re: [Caml-list] yacc question In-Reply-To: <1133705740.11050.14.camel@rosella> Message-ID: References: <1133705740.11050.14.camel@rosella> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Miltered: at concorde with ID 4393090B.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; caml-list:01 parser:01 expr:01 buffer:01 expr:01 token:01 trailing:01 token:01 buffer:01 semicolon:01 semicolon:01 wrote:01 parsed:01 terminates:01 precedence:01 X-Spam-Checker-Version: SpamAssassin 3.0.3 (2005-04-27) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.0.3 On Mon, 5 Dec 2005, skaller wrote: > I have the 'usual' kind of parser for expressions, with two > nonterminals 'expr' and 'atom', the latter including ( expr ) > and INTEGER of course. > > When I have input like > > 1; > 1 + 2 ; > (1 + 2) ; > > none of the case parse as expressions, the first and > last do parse as atoms (leaving the ; in the buffer). > > What I want is to parse the longest possible match. > The only way I can think of doing this is: > > top_expr: > | expr token_not_allowed_in_expressions > > and then 'put back' the trailing token into the buffer. > Is there another way? > The standard way to implement this is: statement: expression SEMICOLON expression: | mul_expression | expression PLUS mul_expression | expression MINUS mul_expression mul_expression: atom_expression | mul_expression TIMES atom_expression | mul_expression DIVIDE atom_expression | mul_expression MODULO atom_expression atom_expression: ATOM | OPEN_PAREN expression CLOSE_PAREN Notice that precedence is taken care of immediately- 3 + 4 * 5 is parsed as 3 + (4 * 5). Also notice that the semicolon is what terminates us- we keep collecting up an expression until we see a semicolon- only then do we complete the statement. Hope this helps. Brian