From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by sympa.inria.fr (Postfix) with ESMTPS id 5905F7FB38 for ; Fri, 26 Dec 2014 12:33:23 +0100 (CET) Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender authenticity information available from domain of frederic.bour@lakaban.net) identity=pra; client-ip=213.251.185.180; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="frederic.bour@lakaban.net"; x-sender="frederic.bour@lakaban.net"; x-conformance=sidf_compatible Received-SPF: Pass (mail3-smtp-sop.national.inria.fr: domain of frederic.bour@lakaban.net designates 213.251.185.180 as permitted sender) identity=mailfrom; client-ip=213.251.185.180; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="frederic.bour@lakaban.net"; x-sender="frederic.bour@lakaban.net"; x-conformance=sidf_compatible; x-record-type="v=spf1" Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender authenticity information available from domain of postmaster@mail.lakaban.net) identity=helo; client-ip=213.251.185.180; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="frederic.bour@lakaban.net"; x-sender="postmaster@mail.lakaban.net"; x-conformance=sidf_compatible X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AiMFAE9GnVTV+7m0/2dsb2JhbABchDCDBclZAoEgAQEBAQF9hA0BBSMPAQUIAQE2Ag8LGAICBRYLAgIJAwIBAgFFEwgBAYgsswZwhGIBBZApAQsBGQaBIYhsgjqDNxaCUoFBnRGDVYd3IoIPgWBugkMBAQE X-IPAS-Result: AiMFAE9GnVTV+7m0/2dsb2JhbABchDCDBclZAoEgAQEBAQF9hA0BBSMPAQUIAQE2Ag8LGAICBRYLAgIJAwIBAgFFEwgBAYgsswZwhGIBBZApAQsBGQaBIYhsgjqDNxaCUoFBnRGDVYd3IoIPgWBugkMBAQE X-IronPort-AV: E=Sophos;i="5.07,647,1413237600"; d="scan'208";a="94746845" Received: from pepper.lakaban.net (HELO mail.lakaban.net) ([213.251.185.180]) by mail3-smtp-sop.national.inria.fr with ESMTP; 26 Dec 2014 12:33:22 +0100 Received: from [192.168.1.34] (88.Red-88-19-55.staticIP.rima-tde.net [88.19.55.88]) (Authenticated sender: defre@ygg-drasil.fr) by mail.lakaban.net (Postfix) with ESMTPSA id 2E4B032D161 for ; Fri, 26 Dec 2014 11:32:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=lakaban.net; s=default; t=1419593537; bh=d4eeOetKkmilcbH6Xxv/bAv+fHzdJYzEsOimimLvzdA=; h=Date:From:To:Subject:References:In-Reply-To:From; b=Kx96UmIM689mnzN7SkoJM73dANE14Mr1Q3NMUYawg1eNW6HxMqM6W7zVI+utiqY5Y nlCordV8a9FQ12sabaGCSJ6VUJP5nDe7+FojA1KRcEelRgU20xyb7p1U08apWLjmmh Lp0jq2UtgAJjk7WqzGDNbnavF6NoDBVcg2FYiR+k= Message-ID: <549D4725.9030202@lakaban.net> Date: Fri, 26 Dec 2014 12:31:49 +0100 From: =?UTF-8?B?RnLDqWTDqXJpYyBCb3Vy?= User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.3.0 MIME-Version: 1.0 To: caml-list@inria.fr References: <20141224233002.GE7225@yquem.inria.fr> <1100699448.864923.1419592410621.JavaMail.yahoo@jws100188.mail.ne1.yahoo.com> In-Reply-To: <1100699448.864923.1419592410621.JavaMail.yahoo@jws100188.mail.ne1.yahoo.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Subject: Re: [Caml-list] New release of Menhir (20141215) Hi, I tried various methods with Merlin, with good results. That's quite close to what you suggest: we add an empty non-terminal and change the behavior when it is on stack. Something like: lexer_switch: | (* empty *) { () } inline: | … | LINK lexer_switch OPEN RAW END OPEN inline* END | … Lexing loop: if has_lexer_switch parser then feed parser (Lexer.raw_token buf) else feed parser (Lexer.token buf) Of course this require introspection (Merlin's internal version exposes a lot more informations, but that's for debugging and experimentation purposes, we hope to clean that). You can still use side-effects to emulate the trick: lexer_switch: | (* empty *) { in_raw_lexer := true } lexer_leave: | (* empty *) { in_raw_lexer := false } … but be careful :). Cheers, Fred On 26/12/2014 12:13, Dario Teixeira wrote: > Hi, > >> Hmm, maybe. The new API will probably allow you to inspect the stack (which is >> basically a list of pairs of a state and a semantic value) and to inspect a >> state (which can be viewed as a set of LR(1) items). I don't know whether that >> would offer you a simple way of deciding when to switch from one lexer to >> another... > I suspect it will *at least* be an improvement over the current situation. > Consider the following rule: > > inline: > | ... > | LINK OPEN RAW END OPEN inline* END > | ... > > Suppose that by default I'm using a 'general' lexer. However, upon encountering > that first OPEN token, I must switch to a 'raw' lexer and then switch back to the > 'general' lexer upon encountering the first END token. This lexer dance won't > happen with the second OPEN token, though. > > Anyway, as long as I know which state Menhir is in, choosing the right lexer > should be an easy task. It may require a large lookup table on my part to map > state to lexer, but at least that's a lot less hairy than the current approach. > > Best regards, > Dario Teixeira >