From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by sympa.inria.fr (Postfix) with ESMTPS id 187F37EE88 for ; Sun, 8 May 2016 11:33:51 +0200 (CEST) IronPort-PHdr: 9a23:B4TYkBSpl9F9I7WcsBYbdzf29tpsv+yvbD5Q0YIujvd0So/mwa64YBGN2/xhgRfzUJnB7Loc0qyN4/GmADRLuMjJmUtBWaIPfidNsd8RkQ0kDZzNImzAB9muURYHGt9fXkRu5XCxPBsdMs//Y1rPvi/6tmZKSV3BPAZ4bt74BpTVx5zukbviqtuKO04R2nKUWvBbElaflU3prM4YgI9veO4a6yDihT92QdlQ3n5iPlmJnhzxtY+a9Z9n9DlM6bp6r5YTGfayQ6NtRrVdCHEiMnspzMztrxjKCwWVtVUGVWBDvhNSAg+N0Bz7TprwqCKy4uZ0wiide9H7TKA5WC6rx6FvRQ70hSFBPDk8pjKEwvdshb5W9Ury7yd0xJTZNdmY Authentication-Results: mail3-smtp-sop.national.inria.fr; spf=None smtp.pra=dario.teixeira@nleyten.com; spf=None smtp.mailfrom=dario.teixeira@nleyten.com; spf=None smtp.helo=postmaster@relay4-d.mail.gandi.net Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender authenticity information available from domain of dario.teixeira@nleyten.com) identity=pra; client-ip=217.70.183.196; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="dario.teixeira@nleyten.com"; x-sender="dario.teixeira@nleyten.com"; x-conformance=sidf_compatible Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender authenticity information available from domain of dario.teixeira@nleyten.com) identity=mailfrom; client-ip=217.70.183.196; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="dario.teixeira@nleyten.com"; x-sender="dario.teixeira@nleyten.com"; x-conformance=sidf_compatible Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender authenticity information available from domain of postmaster@relay4-d.mail.gandi.net) identity=helo; client-ip=217.70.183.196; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="dario.teixeira@nleyten.com"; x-sender="postmaster@relay4-d.mail.gandi.net"; x-conformance=sidf_compatible X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: A0C5AQC2Bi9XkMS3RtlehRCmQZRHhy08EAEBAQEBAQEBEQEBAQEHDQkJIS9BEgGBWYJTAlOBRROIFJ4BoBiGIIlXhQ0FmCKBLieMQI8ejzkCN4FhAQoBAQFSgVaGOIFvfwEBAQ X-IPAS-Result: A0C5AQC2Bi9XkMS3RtlehRCmQZRHhy08EAEBAQEBAQEBEQEBAQEHDQkJIS9BEgGBWYJTAlOBRROIFJ4BoBiGIIlXhQ0FmCKBLieMQI8ejzkCN4FhAQoBAQFSgVaGOIFvfwEBAQ X-IronPort-AV: E=Sophos;i="5.24,594,1454972400"; d="scan'208";a="176991136" Received: from relay4-d.mail.gandi.net ([217.70.183.196]) by mail3-smtp-sop.national.inria.fr with ESMTP/TLS/ADH-AES256-GCM-SHA384; 08 May 2016 11:33:50 +0200 Received: from mfilter46-d.gandi.net (mfilter46-d.gandi.net [217.70.178.177]) by relay4-d.mail.gandi.net (Postfix) with ESMTP id 3E25F1720A1 for ; Sun, 8 May 2016 11:33:50 +0200 (CEST) X-Virus-Scanned: Debian amavisd-new at mfilter46-d.gandi.net Received: from relay4-d.mail.gandi.net ([IPv6:::ffff:217.70.183.196]) by mfilter46-d.gandi.net (mfilter46-d.gandi.net [::ffff:10.0.15.180]) (amavisd-new, port 10024) with ESMTP id pVx0bMjuvbV7 for ; Sun, 8 May 2016 11:33:48 +0200 (CEST) X-Originating-IP: 10.58.1.141 Received: from webmail.gandi.net (webmail1-d.mgt.gandi.net [10.58.1.141]) (Authenticated sender: dario.teixeira@nleyten.com) by relay4-d.mail.gandi.net (Postfix) with ESMTPA id 9FF5817209D for ; Sun, 8 May 2016 11:33:48 +0200 (CEST) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII; format=flowed Content-Transfer-Encoding: 7bit Date: Sun, 08 May 2016 10:33:48 +0100 From: Dario Teixeira To: caml-list@inria.fr Message-ID: <0a49598f1e0c8838fa69cd4d803af83d@nleyten.com> X-Sender: dario.teixeira@nleyten.com User-Agent: Roundcube Webmail/1.1.2 Subject: [Caml-list] Menhir grammar with sequences delimited by same token Hi, (Sending this to Caml-list because Menhir-list is currently down.) I've come across an interesting parsing problem, one for which I wonder if there is a succinct solution in Menhir. Suppose I want to parse a markup which uses the same token for delimiting *both* the beginning and the termination of a bold sequence (and likewise for an emph sequence). Basically this: inline: | TEXT {Ast.Text $1} | BOLD inline* BOLD {Ast.Bold $2} | EMPH inline* EMPH {Ast.Emph $2} Which of course has a shift/reduce conflict: if the token stream is [BOLD; TEXT; BOLD; ...], what should the parser do upon encountering the second BOLD -- start a new nesting level, or close the current one? I can force the latter behaviour by rearranging the grammar so that an inline sequence within BOLDs cannot contain BOLD itself, and likewise for EMPH: inline: | TEXT {Ast.Text $1} | BOLD inline_sans_bold* BOLD {Ast.Bold $2} | EMPH inline_sans_emph* EMPH {Ast.Emph $2} inline_sans_bold: | TEXT {Ast.Text $1} | EMPH inline_sans_emph* EMPH {Ast.Emph $2} inline_sans_emph: | TEXT {Ast.Text $1} | BOLD inline_sans_bold* BOLD {Ast.Bold $2} For this simple example this approach is feasible, but blows up into silliness for a real-world case where besides BOLD and EMPH I have many other similar tokens. Does Menhir offer a more succinct solution to this problem? (I reckon using the priority mechanism somehow, but exactly how eludes me.) Thanks in advance for your time! Best regards, Dario Teixeira