* re-entrant CAMLYACC parsers?
From: chet @ 1999-08-03 21:45 UTC (permalink / raw)
To: caml-list
How do most people implement re-entrancy in caml-yacc parsers that
must manipulate side-state? For example, I want a hashtable that maps
a name to its right-hand side, which the lexer expands dynamically and
pushes back onto the lexbuf. (I've enclosed the code I use to do that.
It looks right, and it works, but I'm not certain that I got the
semantics of Lexing.lexbuf right.) It would be nice to store that
table someplace on the stack, so that I don't have to count on calls
to the parser being serialized.
I've thought about hacking Parsing to add an extra slot of polymorphic
type to the parser_env record, and then adding some syntax to the
caml-yacc language to fetch that value.
Anybody done anything like this?
--chet--
P.S. I realize I could do this trivially with stream-parsers, but I'd
prefer to write a yacc-parser for efficiency and declarativeness.
* Re: re-entrant CAMLYACC parsers?
From: Gerd Stolpmann @ 1999-08-13 19:52 UTC (permalink / raw)
To: chet, caml-list
On Tue, 03 Aug 1999, chet@watson.ibm.com wrote:
>How do most people implement re-entrancy in caml-yacc parsers which
>must manipulate side-state? E.g., if I want to have some hashtable to
>store a mapping from name to right-hand-side, which the lexer will
>expand dynamically and push back onto the lexbuf (I've enclosed the
>code I use to do that -- it looks right, and it works, but I'm not
>certain that I got the semantics of Lexing.lexbuf down), it'd be nice
>to store that someplace on the stack, so that I don't have to count on
>serialization of calls to the parser.
>
>I've thought about hacking Parsing to have an extra slot in the
>parsing_env record, of polymorphic type, and then adding some syntax
>to the caml-yacc language to fetch that value.
>
>Anybody done anything like this?
>
I recently had the same problem with my XML parser (BTW: it will be
available soon). The idea is simple: make the parser an object!
By default, ocamlyacc creates a module containing some definitions (some
tables, some functions). The module looks more or less like this:

    type token = <definition of the token type>
    open Parsing
    <Here the header between %{ and %} verbatim>
    let yytransl_const = ...
    let yytransl_block = ...
    let yylhs = ...
    let yylen = ...
    let yydefred = ...
    let yydgoto = ...
    let yysindex = ...
    let yyrindex = ...
    let yygindex = ...
    let yytablesize = ...
    let yytable = ...
    let yycheck = ...
    let yyact =
      < an array of functions representing the actions of the rules >
    let yytables = ...
    < for every start symbol another let <symbol> = ... definition >
    < now the trailer after %% verbatim >

To make this an object, the generated code must be postprocessed, for example
by some sed scripts: Some of the "let" definitions should be changed into "val"
definitions, some should be turned into "method" definitions, and of course the
invocations of the methods must be changed. It seems to be sufficient to
change "yyact", "yytables", and all functions representing start symbols into
methods, and let all other symbols be "val"-defined instance variables.
For me, the following "sed" script works:

    sed -e 's/^let yyact /method xxact /g' \
        -e 's/^let yytables /method xxtables /g' \
        -e 's/^let yy/val yy/g' \
        -e 's/^let ext_/method ext_/g' \
        -e 's/yytablesize/zztablesize/g' \
        -e 's/yyact/(self#yyact)/g' \
        -e 's/yytables/(self#yytables)/g' \
        -e 's/xxact/yyact/g' \
        -e 's/xxtables/yytables/g' \
        -e 's/zztablesize/yytablesize/g' \
        markup_yacc.ml0 >>markup_yacc.ml

It assumes that "markup_yacc.ml0" is the name of the output file of ocamlyacc,
and that all start symbols begin with "ext_".
Next, the header and trailer must introduce the class we are defining. For
example, my header contains

    class ['ext] parser_object <some more variables> =
      object (self)
        <Some more "val" variables, some more methods>

The trailer contains the "end" keyword that corresponds to the "class":

      end

The additional instance variables store the state of every parser object;
this should solve your problem. Note that my parser has a type parameter
'ext, which is a further possibility this solution opens up.
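
To illustrate why instance variables give re-entrancy, here is a minimal
sketch. It is not the generated parser: the class name "macro_parser" and
its methods "define" and "expand" are hypothetical, chosen only to mirror
the name-to-right-hand-side table from the original question. The point is
that each object owns its table, so two parsers can be active at the same
time without sharing state.

```ocaml
(* Hypothetical sketch: per-object side-state instead of a global table.
   Nothing here is produced by ocamlyacc; all names are illustrative. *)
class ['ext] macro_parser (default : 'ext) =
  object
    (* Each parser object owns a private name -> right-hand-side table. *)
    val macros : (string, 'ext) Hashtbl.t = Hashtbl.create 16

    (* Record or overwrite a definition in this object's table. *)
    method define (name : string) (rhs : 'ext) : unit =
      Hashtbl.replace macros name rhs

    (* Look a name up in this object's table only. *)
    method expand (name : string) : 'ext =
      match Hashtbl.find_opt macros name with
      | Some rhs -> rhs
      | None -> default
  end

(* Two independent parser objects. *)
let p1 = new macro_parser "<undefined>"
let p2 = new macro_parser "<undefined>"

let () =
  p1#define "greeting" "hello";
  p2#define "greeting" "bonjour"
```

After this runs, p1 and p2 each resolve "greeting" to their own definition,
which is the property a global hashtable cannot provide.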
A problem is that ocamlyacc also generates a module interface, and that the
generated interface is now wrong. This means that you must write an additional
script that extracts the "type token" definition from the generated interface
and adds the class interface. The following "awk" script extracts "token":

    awk 'BEGIN { copy = 0; }\
         /^type/ { copy = 1; }\
         /^val/ { copy = 0; }\
         { if (copy) print $0 }' <markup_yacc.mli >markup_yacc_token.mlf

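
For readers who would rather stay in OCaml, the same filtering rule can be
sketched as a small function. This is an illustrative equivalent, not part
of the original build; the names "starts_with", "extract_token_type" and
"sample_mli" are hypothetical, and the sample interface is made up.

```ocaml
(* True if [line] begins with [prefix]. *)
let starts_with prefix line =
  let n = String.length prefix in
  String.length line >= n && String.sub line 0 n = prefix

(* Same rule as the awk script: copying is switched on by a line that
   begins with "type", switched off by a line that begins with "val",
   and every line seen while copying is on is kept. *)
let extract_token_type (lines : string list) : string list =
  let step (copy, acc) line =
    let copy =
      if starts_with "type" line then true
      else if starts_with "val" line then false
      else copy
    in
    (copy, if copy then line :: acc else acc)
  in
  let _, kept = List.fold_left step (false, []) lines in
  List.rev kept

(* Made-up example of a generated interface, for illustration only. *)
let sample_mli =
  [ "type token =";
    "  | NAME of (string)";
    "  | EOF";
    "";
    "val ext_document :";
    "  (Lexing.lexbuf -> token) -> Lexing.lexbuf -> unit" ]

let token_lines = extract_token_type sample_mli
```
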
Note that it is important that the generated definition of "token" is used
(even the order of the variants counts); otherwise the wrong tokens are
recognized.
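
The reason the order matters can be seen from how OCaml represents variant
values at run time. The sketch below uses a hypothetical token type and a
hypothetical helper "code_of_token"; it assumes, without quoting the Parsing
source, that the parsing engine translates tokens through tables indexed by
this representation (yytransl_const and yytransl_block above).

```ocaml
(* Hypothetical token type; ocamlyacc would generate the real one. *)
type token = LANGLE | RANGLE | EOF | NAME of string | CDATA of string

(* Constant constructors are represented as the integers 0, 1, 2, ... in
   declaration order. Constructors with arguments are heap blocks whose
   tags are numbered separately, also in declaration order. *)
let code_of_token (t : token) : [ `Const of int | `Block of int ] =
  let r = Obj.repr t in
  if Obj.is_block r then `Block (Obj.tag r) else `Const (Obj.obj r : int)
```

Reordering the constructors changes these numbers, so a hand-written copy
of "token" with a different order would make the tables pick the wrong
entries.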
Of course, it would be best if "ocamlyacc" had a "-class" option that generates
the class for you...
BTW: ocamlyacc has a bug that leads to wrong line numbers in error messages of
ocamlc if the error is in the trailer.
Gerd
--
----------------------------------------------------------------------------
Gerd Stolpmann Telefon: +49 6151 997705 (privat)
Viktoriastr. 100
64293 Darmstadt EMail: Gerd.Stolpmann@darmstadt.netsurf.de (privat)
Germany
----------------------------------------------------------------------------
* Re: re-entrant CAMLYACC parsers?
From: chet @ 1999-08-13 21:18 UTC (permalink / raw)
To: Gerd.Stolpmann; +Cc: chet, caml-list
How amusing! I'm hacking an XML parser in CAML, too!
It sure was easy!
I ended up hacking caml-yacc to do it -- added a slot in the
parser_env, and a method to fetch the slot in parser actions.
--chet--