From: skaller <skaller@users.sourceforge.net>
To: Nathaniel Gray <n8gray@gmail.com>
Cc: Francois.Pottier@inria.fr, Caml Mailing List <caml-list@yquem.inria.fr>
Subject: Re: [Caml-list] [ANNOUNCE] Alpha release of Menhir, an LR(1) parser generator for ocaml
Date: Wed, 14 Dec 2005 17:08:15 +1100
Message-ID: <1134540495.8980.63.camel@rosella>
In-Reply-To: <aee06c9e0512131307k3fc494a5k3591d549d552f1b@mail.gmail.com>
On Tue, 2005-12-13 at 13:07 -0800, Nathaniel Gray wrote:
> This is pretty nice! Every time I use ocamlyacc I think "somebody
> should write something better." Now it looks like somebody has! I
> can't tell you how many times I've wanted parameterized rules and
> simple "library" rules for parsing delimiter-separated lists and
> such...
Yes, it is pretty nice! However it still appears to have some
problems. Any comments appreciated.
0. The licence. Q public licence for the generator????
Please NO NO NO!! Not unless it is distributed
as part of the official distro. Is there any chance of that?
If not, even GPL would be better ;(
1. Generating a functor is cute, but it doesn't seem to
allow arguments to parser functions. Perhaps I missed something?
Is there a way to use the functorisation with closures to
add an argument?
In particular, can the parser be generated *inside*
an environment such as a function or let binding?
[Felix allows that, which means an extra argument is
not required, a variable in the environment can be used
instead]
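A minimal sketch of the kind of thing I mean, assuming the generated
parser is a functor over some environment signature. The names ENV,
MakeParser and parse_with are hypothetical stand-ins, not anything
Menhir actually generates: the functor is applied inside a function
via a local module, so the semantic actions can see a run-time value
without the parser function taking an extra argument.

```ocaml
(* Hypothetical stand-in for a generated, functorised parser: the
   functor argument plays the role of the "extra argument". *)
module type ENV = sig
  val lookup : string -> int   (* assumed environment interface *)
end

module MakeParser (E : ENV) = struct
  (* Toy "parser": sums the values bound to the identifiers it reads. *)
  let parse (tokens : string list) : int =
    List.fold_left (fun acc id -> acc + E.lookup id) 0 tokens
end

(* Apply the functor inside a function, so that the parser closes
   over the run-time value [table]. *)
let parse_with (table : (string * int) list) (tokens : string list) : int =
  let module P = MakeParser (struct
    let lookup id = List.assoc id table
  end) in
  P.parse tokens

let () =
  Printf.printf "%d\n" (parse_with [ ("x", 1); ("y", 41) ] [ "x"; "y" ])
```

The cost is that the functor is re-applied on every call of parse_with,
which may or may not matter for a real generated parser.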
2. The signature of parsers is still wrong?
Ocamlyacc uses the typing
val parser: (lexbuf->token) -> lexbuf -> 'a
which is just bad. A better signature is
val parser: ( unit -> token ) -> 'a
There is no need to provide location information: the correct
solution is to throw an exception, which is caught in a
context which can determine the location.
It would be nice to be able to generate this signature
with a command line switch, pragma, or some other mechanism,
even if the default is chosen for ocamlyacc compatibility.
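Until then, one can adapt by hand. A rough sketch, assuming a toy
parser with the ocamlyacc-style signature (yacc_style, simplify and
stream_of_list are made-up names for illustration): the wrapper feeds
the parser a dummy lexbuf and ignores it in the lexer function.

```ocaml
type token = INT of int | PLUS | EOF

(* Toy parser with the ocamlyacc-style signature; it just sums the
   INT tokens until EOF. *)
let yacc_style (lexer : Lexing.lexbuf -> token) (lexbuf : Lexing.lexbuf) : int =
  let rec loop acc =
    match lexer lexbuf with
    | INT n -> loop (acc + n)
    | PLUS -> loop acc
    | EOF -> acc
  in
  loop 0

(* Adapter: expose a parser as (unit -> token) -> 'a by passing a
   dummy lexbuf that the lexer function never looks at. *)
let simplify parse (next : unit -> token) =
  let dummy = Lexing.from_string "" in
  parse (fun _ -> next ()) dummy

(* A token source built from a list; returns EOF once exhausted. *)
let stream_of_list (toks : token list) : unit -> token =
  let rest = ref toks in
  fun () ->
    match !rest with
    | [] -> EOF
    | t :: ts -> rest := ts; t

let result = simplify yacc_style (stream_of_list [ INT 1; PLUS; INT 2 ])
```

Any position the parser reads from the dummy lexbuf is meaningless,
which is acceptable only if locations are recovered elsewhere, as
argued above.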
3. I have doubts about the claim that parsers can 'share'
token types. I do not see how this is possible. It is
contradicted by the compilation model description, which
explains how it is necessary to join separate files making
up a grammar specification. In this case, the joined system
is going to generate a single token type, and any other
joining is certain to generate a distinct type because
(a) the type is defined in a distinct ocaml module (mli file)
(b) the typing of normal variants is nominal
This problem would go away if polymorphic variants
were used instead, because the typenames are then simply
abbreviations, since pm-variants are structurally, not
nominally, typed.
Perhaps a command line switch, pragma, or whatever, to use
polymorphic variants instead of ordinary ones?
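To illustrate the typing point (a sketch only; the module names are
invented and nothing here is Menhir output): two modules each declare
their own token type as a polymorphic variant, and values flow between
them because both names abbreviate the same structural type.

```ocaml
(* Two independently written modules, each with its own declaration
   of the token type. *)
module Lexer_tokens = struct
  type token = [ `INT of int | `PLUS | `EOF ]
  let sample : token list = [ `INT 1; `PLUS; `INT 2; `EOF ]
end

module Parser_tokens = struct
  type token = [ `INT of int | `PLUS | `EOF ]
  let sum (toks : token list) : int =
    List.fold_left
      (fun acc t -> match t with `INT n -> acc + n | `PLUS | `EOF -> acc)
      0 toks
end

(* Accepted: Lexer_tokens.token and Parser_tokens.token are the same
   structural type. With ordinary variants (INT of int | PLUS | EOF
   declared twice) this application would be a type error. *)
let total = Parser_tokens.sum Lexer_tokens.sample
```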
Actually, I personally find the 'yacc' technique of
generating tokens to be rather lame. Felix does this
much better -- the parser simply expects a token type
which is a variant, the type can be defined wherever
you like. In particular, the lexer and parser can
share that definition.
As far as I can see Menhir COULD do this, except of
course one would use %token as a special way
of generating the variant. All that would be required
I think is the syntax
%import_tokens "filename"
which refers to the token definition file -- as an
alternative to inlining these token definitions.
(if pm-variants are used you could probably support both,
though I'm not sure).
A token definition file then generates two files,
an ordinary mli file with the token variant type,
and a special information file for the parser generator
(with the same information, but in a more useful form).
In Felix none of this is necessary because parsing is
built in, so the compiler can find the information required
for the parser generator directly from the token variant type.
4. Just curious, but how practical is LR(1) in terms of
generated code sizes? Felix is using Elkhound as its
parser, which is a GLR parser with an LALR(1) core. In theory
there is an option for choosing the core automaton, which
also allows LR(1); however, I recall Scott McPeak commenting
that it wasn't worth supporting because it generated tables
which were far too big.
I'm curious how one would be able to predict the size of the
generated code since I don't really understand the
additional constraints LALR(1) introduces.
--
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net