* There's an elephant in the room: Solution to sharing a symbol table @ 2007-05-01 21:39 Joel Reymont 2007-05-02 5:44 ` [Menhir-list] " Francois Pottier 0 siblings, 1 reply; 10+ messages in thread From: Joel Reymont @ 2007-05-01 21:39 UTC (permalink / raw) To: Caml List; +Cc: menhir-list, skaller %type <Symtab.t -> Easy.program> program: statement EOF { fun t -> List.rev ($1 t) } etc Would this work with Menhir and would it be detrimental to performance? I think there's an elephant in the room. Functorization of a Menhir- generated parser doesn't solve this issue since the specialization happens at compile time whereas a new symbol table is a run-time value. Thanks, Joel -- http://wagerlabs.com/ ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Menhir-list] There's an elephant in the room: Solution to sharing a symbol table 2007-05-01 21:39 There's an elephant in the room: Solution to sharing a symbol table Joel Reymont @ 2007-05-02 5:44 ` Francois Pottier 2007-05-02 5:58 ` Joel Reymont 0 siblings, 1 reply; 10+ messages in thread From: Francois Pottier @ 2007-05-02 5:44 UTC (permalink / raw) To: Joel Reymont; +Cc: Caml List, skaller, menhir-list Hello, On Tue, May 01, 2007 at 10:39:19PM +0100, Joel Reymont wrote: > > %type <Symtab.t -> Easy.program> > > program: statement EOF { fun t -> List.rev ($1 t) } > > Would this work with Menhir and would it be detrimental to performance? Yes, it would work (with ocamlyacc or Menhir). It is in fact a classic idiom. However, you must be aware that having your semantic actions build closures means that all of the actual work (e.g., the invocation of List.rev above) is delayed until *after* the entire parsing process is over. In other words, performance-wise, your code is equivalent to building an abstract syntax tree, without using your symbol table, and then transforming that tree. > I think there's an elephant in the room. Functorization of a Menhir- > generated parser doesn't solve this issue since the specialization > happens at compile time whereas a new symbol table is a run-time value. Functors can be applied at run-time (via "let module"), so a functor parameter *can* provide a parser with runtime information. I would still recommend the %parameter approach, since it is syntactically lighter and more efficient (the semantic actions won't be delayed). -- François Pottier Francois.Pottier@inria.fr http://cristal.inria.fr/~fpottier/ ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Menhir-list] There's an elephant in the room: Solution to sharing a symbol table 2007-05-02 5:44 ` [Menhir-list] " Francois Pottier @ 2007-05-02 5:58 ` Joel Reymont [not found] ` <20070502070345.GA5242@yquem.inria.fr> 2007-05-02 9:04 ` [Caml-list] " skaller 0 siblings, 2 replies; 10+ messages in thread From: Joel Reymont @ 2007-05-02 5:58 UTC (permalink / raw) To: Francois.Pottier; +Cc: Caml List, skaller, menhir-list On May 2, 2007, at 6:44 AM, Francois Pottier wrote: > Functors can be applied at run-time (via "let module"), so a > functor parameter > *can* provide a parser with runtime information. I would still > recommend the > %parameter approach, since it is syntactically lighter and more > efficient (the > semantic actions won't be delayed). I have been butting my head against this for a few days now. Please, please provide an example!!! I'm not talking about a general example of a %parameter approach but an example of using that approach with a lexer. Consider this function that should parser a string and return an AST: let parse str = let lexbuf = Lexing.from_string str in program (token tab) lexbuf The token lexer above takes a tab argument before lexbuf but program always expects (Lexing.lexbuf -> token) so we give program the curried function. How do you write parse above with the %parameter approach to ensure that parse ALWAYS uses a new symbol table that is shared between the lexer and the parser in this parsing run? Thanks, Joel -- http://wagerlabs.com/ ^ permalink raw reply [flat|nested] 10+ messages in thread
[parent not found: <20070502070345.GA5242@yquem.inria.fr>]
* Re: [Menhir-list] There's an elephant in the room: Solution to sharing a symbol table [not found] ` <20070502070345.GA5242@yquem.inria.fr> @ 2007-05-02 7:35 ` Joel Reymont 0 siblings, 0 replies; 10+ messages in thread From: Joel Reymont @ 2007-05-02 7:35 UTC (permalink / raw) To: Francois.Pottier; +Cc: menhir-list, Caml List I'm hurt! It cannot be this easy. Thank you Francois! On May 2, 2007, at 8:03 AM, Francois Pottier wrote: > Well, if the parser is parameterized over "T : sig val tab: ... end", > then you could write something like > > (* assuming "tab" is defined here *) > > let parser str = > let lexbuf = Lexing.from_string str in > let module P = EasyParser.Make (struct let tab = tab end) in > P.program (token tab) lexbuf -- http://wagerlabs.com/ ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Caml-list] Re: [Menhir-list] There's an elephant in the room: Solution to sharing a symbol table 2007-05-02 5:58 ` Joel Reymont [not found] ` <20070502070345.GA5242@yquem.inria.fr> @ 2007-05-02 9:04 ` skaller 2007-05-02 11:56 ` Francois Pottier 1 sibling, 1 reply; 10+ messages in thread From: skaller @ 2007-05-02 9:04 UTC (permalink / raw) To: Joel Reymont; +Cc: Francois.Pottier, menhir-list, Caml List On Wed, 2007-05-02 at 06:58 +0100, Joel Reymont wrote: > On May 2, 2007, at 6:44 AM, Francois Pottier wrote: > How do you write parse above with the %parameter approach to ensure > that parse ALWAYS uses a new symbol table that is shared between the > lexer and the parser in this parsing run? You can't. Don't be confused by functor interface in Menhir. It is useless for maintaining state. What you need is a feature I asked for .. the possibility of passing a state argument to the parser the same way you can now do for ocamllex: you just write rule fred state = parse ... and now 'state' is passed to all actions. That's ocamllex, we need the same for parsing. [I thought this was implemented but can't see it anywhere in the Menhir manual] This IS a way to hack this .. by using the ocamllex feature and cheating by wrapping stuff inside tokens so that the parser user actions can get at the state object. -- John Skaller <skaller at users dot sf dot net> Felix, successor to C++: http://felix.sf.net ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Caml-list] Re: [Menhir-list] There's an elephant in the room: Solution to sharing a symbol table 2007-05-02 9:04 ` [Caml-list] " skaller @ 2007-05-02 11:56 ` Francois Pottier 2007-05-02 16:07 ` skaller 0 siblings, 1 reply; 10+ messages in thread From: Francois Pottier @ 2007-05-02 11:56 UTC (permalink / raw) To: skaller; +Cc: Joel Reymont, menhir-list, Caml List Hello, On Wed, May 02, 2007 at 07:04:50PM +1000, skaller wrote: > You can't. Don't be confused by functor interface in Menhir. > It is useless for maintaining state. I don't think so... > What you need is a feature I asked for .. the possibility of > passing a state argument to the parser the same way you can > now do for ocamllex: you just write > > rule fred state = parse ... I remember our discussion. What you need primarily, if I recall correctly, is the ability for a semantic action to recursively invoke the parser itself. This happens to work if the parser is defined as a recursive function, and does not (currently) work when the parser is defined as a functor, because the functor is not declared as recursive. We could, in principle, make it a recursive functor, which would allow recursive invocations. -- François Pottier Francois.Pottier@inria.fr http://cristal.inria.fr/~fpottier/ ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Caml-list] Re: [Menhir-list] There's an elephant in the room: Solution to sharing a symbol table 2007-05-02 11:56 ` Francois Pottier @ 2007-05-02 16:07 ` skaller 2007-05-02 18:30 ` Francois Pottier 0 siblings, 1 reply; 10+ messages in thread From: skaller @ 2007-05-02 16:07 UTC (permalink / raw) To: Francois.Pottier; +Cc: Caml List, menhir-list On Wed, 2007-05-02 at 13:56 +0200, Francois Pottier wrote: > Hello, > > On Wed, May 02, 2007 at 07:04:50PM +1000, skaller wrote: > > You can't. Don't be confused by functor interface in Menhir. > > It is useless for maintaining state. > > I don't think so... State must be maintained in function arguments. Functors are entirely irrelevant. This is about associating data with a thread of control. Joel asked: "How do you write parse above with the %parameter approach to ensure that parse ALWAYS uses a new symbol table that is shared between the lexer and the parser in this parsing run?" and the answer is you have to pass an argument to the parser function. > > What you need is a feature I asked for .. the possibility of > > passing a state argument to the parser the same way you can > > now do for ocamllex: you just write > > > > rule fred state = parse ... > > I remember our discussion. What you need primarily, if I recall > correctly, is the ability for a semantic action to recursively > invoke the parser itself. This happens to work if the parser is > defined as a recursive function, and does not (currently) work > when the parser is defined as a functor, because the functor is > not declared as recursive. We could, in principle, make it a > recursive functor, which would allow recursive invocations. Let me give you a real example: I have a parser that reads Felix code that looks like: fun f .. // felix stuff include "filename"; fun g .. // When the parser sees the include statement it parses the file 'filename'. The parse of the included file, in Felix, is independent of the surrounding code. The parser is called recursively and the resulting AST is returned from the user action. There is no way to do this without passing a state variable on the stack. In fact Felix uses a variable passed to the lexer to work around the deficiency in Ocamlyacc .. it passes the variable to the parser in a token .. :) [That works with Menhir too by the way] -- John Skaller <skaller at users dot sf dot net> Felix, successor to C++: http://felix.sf.net ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Caml-list] Re: [Menhir-list] There's an elephant in the room: Solution to sharing a symbol table 2007-05-02 16:07 ` skaller @ 2007-05-02 18:30 ` Francois Pottier 2007-05-03 1:17 ` skaller 0 siblings, 1 reply; 10+ messages in thread From: Francois Pottier @ 2007-05-02 18:30 UTC (permalink / raw) To: skaller; +Cc: Caml List, menhir-list Hi, On Thu, May 03, 2007 at 02:07:26AM +1000, skaller wrote: > State must be maintained in function arguments. This is a trivial example, but: let x = ref 0 let f () : bool = incr x; x / 2 = 0 Here, f exploits an internal state, even though it is not explicitly parameterized over a piece of state. Similarly, a parser can have internal state, even if the parsing functions are not explicitly parameterized over a piece of state. It is sufficient for the entire parser either to refer to a global variable or to abstracted over a piece of state (which functors allow). > When the parser sees the include statement it parses the > file 'filename'. The parse of the included file, in Felix, > is independent of the surrounding code. The parser is > called recursively and the resulting AST is returned > from the user action. > > There is no way to do this without passing a state variable > on the stack. I don't see why this is so. In fact, nothing in your example seems to require any kind of state -- you only need the parser to be able to recursively invoke itself. Or am I missing something? -- François Pottier Francois.Pottier@inria.fr http://cristal.inria.fr/~fpottier/ ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Caml-list] Re: [Menhir-list] There's an elephant in the room: Solution to sharing a symbol table 2007-05-02 18:30 ` Francois Pottier @ 2007-05-03 1:17 ` skaller 2007-05-03 8:48 ` Joel Reymont 0 siblings, 1 reply; 10+ messages in thread From: skaller @ 2007-05-03 1:17 UTC (permalink / raw) To: Francois.Pottier; +Cc: Caml List, menhir-list On Wed, 2007-05-02 at 20:30 +0200, Francois Pottier wrote: > Hi, > > On Thu, May 03, 2007 at 02:07:26AM +1000, skaller wrote: > > State must be maintained in function arguments. > > This is a trivial example, but: > > let x = ref 0 > let f () : bool = incr x; x / 2 = 0 Yes, but I rule out global mutable state as unacceptable by requiring reentrancy. > Here, f exploits an internal state, even though it is not explicitly > parameterized over a piece of state. Similarly, a parser can have internal > state, even if the parsing functions are not explicitly parameterized over a > piece of state. It is sufficient for the entire parser either to refer to a > global variable or to abstracted over a piece of state (which functors allow). As above, if that state is global the result isn't re-entrant and isn't acceptable. > I don't see why this is so. In fact, nothing in your example seems to require > any kind of state -- you only need the parser to be able to recursively invoke > itself. Or am I missing something? I didn't show the full parser of course. In fact, Felix grammar is user extensible, so there has to be somewhere to hold the extensions. Similarly, a C parser requires a symbol table for typedefs, and Felix allows embedded C. In fact, Felix can call the FrontC/CIL parser on fragments of C code -- and CIL uses global variables so isn't re-entrant. In fact Joel explained the 'elephant' technique: let parser str = let tab = .. in let lexbuf = Lexing.from_string str in let module P = EasyParser.Make (struct let tab = tab end) in P.program (token tab) lexbuf which uses a local module to turn a global variable into a function local one. That's pretty clever, the function 'parser' here is reentrant (even though P.program is not). To support recursion you'd need: let module P = EasyParser.Make (struct let tab = tab let parser = parser (* pass fun for recursion *) end) in Quite a clever use of a local modules and functors therefore, allows %parameter to suffice, without further changes to Menhir .. though it would still be much easier if you could just do it like Ocamllex, I have to agree this does work. -- John Skaller <skaller at users dot sf dot net> Felix, successor to C++: http://felix.sf.net ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Caml-list] Re: [Menhir-list] There's an elephant in the room: Solution to sharing a symbol table 2007-05-03 1:17 ` skaller @ 2007-05-03 8:48 ` Joel Reymont 0 siblings, 0 replies; 10+ messages in thread From: Joel Reymont @ 2007-05-03 8:48 UTC (permalink / raw) To: skaller; +Cc: Francois.Pottier, Caml List, menhir-list On May 3, 2007, at 2:17 AM, skaller wrote: > In fact Joel explained the 'elephant' technique: The elephant in the room was something that, seemingly, no one talked about, although it was large and glaringly obvious. > let parser str = > let tab = .. in > let lexbuf = Lexing.from_string str in > let module P = EasyParser.Make (struct let tab = tab end) in > P.program (token tab) lexbuf Kudos to Francois to offering the above example. Thanks, Joel -- http://wagerlabs.com/ ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2007-05-03 8:48 UTC | newest] Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2007-05-01 21:39 There's an elephant in the room: Solution to sharing a symbol table Joel Reymont 2007-05-02 5:44 ` [Menhir-list] " Francois Pottier 2007-05-02 5:58 ` Joel Reymont [not found] ` <20070502070345.GA5242@yquem.inria.fr> 2007-05-02 7:35 ` Joel Reymont 2007-05-02 9:04 ` [Caml-list] " skaller 2007-05-02 11:56 ` Francois Pottier 2007-05-02 16:07 ` skaller 2007-05-02 18:30 ` Francois Pottier 2007-05-03 1:17 ` skaller 2007-05-03 8:48 ` Joel Reymont
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox