* "nested" parsers
From: Georges Mariano @ 1999-12-02 13:25 UTC
To: caml-list
Hello everyone,
We have developed a parser for a language L using ocamllex
and ocamlyacc, so we have an L.mly.
It appears that we can divide the language L into a few "sub"-languages.
Let's say that L3 <: L2 <: L1 = L (where '<:' means "is included in").
And we would like to have one entry point for each language
in our parser.
a) Is it possible (i.e. with ocamlyacc)?
(With one .mly, or with several?)
b) How do we do that? (It's not very clear to us ;-)
(Pointers to specific documentation or examples are
welcome...)
Thanks for any help
--
> Georges MARIANO tel: (33) 03 20 43 84 06
> INRETS, fax: (33) 03 20 43 83 59
> Institut National de Recherches sur les Transport et leur Sécurité
> 20 rue Elisee Reclus
> 59650 Villeneuve d'Ascq mailto:mariano@terre.inrets.fr
> FRANCE.
> B.U.G http://www3.inrets.fr/BUGhome.html mailto:bug@estas1.inrets.fr
* Re: "nested" parsers
From: Hendrik Tews @ 1999-12-03 13:31 UTC
To: caml-list
Georges Mariano writes:
> Date: Thu, 02 Dec 1999 13:25:04 +0000
> Subject: "nested" parsers
> It appears that we can divide the language L into a few "sub"-languages.
> Let's say that L3 <: L2 <: L1 = L (where '<:' means "is included in").
> And we would like to have one entry point for each language
> in our parser.
ocamlyacc generates a parsing function for each start symbol that
you declare in the grammar file. Therefore the easiest way (if
possible) is to rewrite your grammar so that for each of the
languages Ln there is one nonterminal that generates exactly this
language. Then you declare each of these nonterminals as a start
symbol, with %start directives in the declarations section of the
grammar file.
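As a minimal sketch of this, assume a toy language in which L1 is sums, L2 is products and L3 is single integers, so that L3 <: L2 <: L1. The token names and the nonterminal names (l1, l2, l3, sum, product, atom) are hypothetical and not taken from L.mly; the grammar file might then look roughly like this:

```ocaml
/* Hypothetical ocamlyacc grammar file with one entry point per sub-language. */
%token <int> INT
%token PLUS TIMES EOF

/* One start symbol per language; each needs a %type declaration. */
%start l1 l2 l3
%type <int> l1 l2 l3

%%

/* Entry points: each accepts one sub-language followed by end of input. */
l1: sum EOF      { $1 } ;
l2: product EOF  { $1 } ;
l3: atom EOF     { $1 } ;

/* Shared rules: the sub-languages reuse the same nonterminals. */
sum:
    product               { $1 }
  | sum PLUS product      { $1 + $3 }
;
product:
    atom                  { $1 }
  | product TIMES atom    { $1 * $3 }
;
atom:
    INT                   { $1 }
;
```

For a grammar like this, ocamlyacc should generate three parsing functions named l1, l2 and l3 in the output module, each taking a lexer function and a lexbuf, so a single .mly file is enough.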
Bye,
Hendrik
* Re: "nested" parsers
From: Remi VANICAT @ 1999-12-03 14:32 UTC
To: Georges Mariano; +Cc: caml-list
Georges Mariano <georges.mariano@inrets.fr> writes:
> Hello everyone,
>
> We have developed a parser for a language L using ocamllex
> and ocamlyacc, so we have an L.mly.
>
> It appears that we can divide the language L into a few "sub"-languages.
> Let's say that L3 <: L2 <: L1 = L (where '<:' means "is included in").
>
> And we would like to have one entry point for each language
> in our parser.
> a) Is it possible (i.e. with ocamlyacc)?
> (With one .mly, or with several?)
>
> b) How do we do that? (It's not very clear to us ;-)
> (Pointers to specific documentation or examples are
> welcome...)
I am not sure what you want, but the documentation says
this:
%start symbol ... symbol
Declare the given symbols as entry points for the
grammar. For each entry point, a parsing function with the
same name is defined in the output module. Non-terminals
that are not declared as entry points have no such parsing
function. Start symbols must be given a type with the %type
directive below.
So for each sub-language you may put the corresponding symbol in the
%start clause.
* Re: "nested" parsers
From: skaller @ 1999-12-03 18:18 UTC
To: Hendrik Tews; +Cc: caml-list
Hendrik Tews wrote:
>
> Georges Mariano writes:
> Date: Thu, 02 Dec 1999 13:25:04 +0000
> Subject: "nested" parsers
>
> It appears that we can divide the language L into a few "sub"-languages.
> Let's say that L3 <: L2 <: L1 = L (where '<:' means "is included in").
>
> And we would like to have one entry point for each language
> in our parser.
>
> ocamlyacc generates a parsing function for each start symbol that
> you declare in the grammar file. Therefore the easiest way (if
> possible) is to rewrite your grammar so that for each of the
> languages Ln there is one nonterminal that generates exactly this
> language. Then you declare each of these nonterminals as a start
> symbol, with %start directives in the declarations section of the
> grammar file.
But it isn't clear how to invoke these sublanguages
recursively, from within an action associated with a reduce
operation for the very sublanguage which we wanted a separate
parser for.
For example, the Python language can conveniently
be divided into two distinct languages: a statement language
and an expression language. Certainly, I can have a non-terminal
for statements, and one for expressions, and make both entry
points --- and in my parser for Python I do just that.
But that isn't what I want to do.
What I actually want is a recursive transition machine corresponding
to a meta-grammar (the RHS of a production can be a regular expression,
as in EBNF, for example).
This can be organised by a finite state automaton in which
a non-terminal transition pushes down the whole automaton,
and starts a new one corresponding to the RHS of the production
defining the non-terminal labelling the arc being followed.
Choosing the right arc is the hard bit. If the first sets
of all the arcs out of a node are known, the number of trial
parses can be reduced -- to one, if the first sets are disjoint.
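A minimal sketch of such a machine in OCaml, under several assumptions that are not part of the description above: tokens are single characters, the first set of every arc is written out by hand, and the first sets of the arcs leaving a state are disjoint, so no trial parses are needed. All type and function names (label, arc, automaton, run, accepts) and the toy grammar are hypothetical. The OCaml call stack plays the role of the pushed-down automata.

```ocaml
(* An arc either consumes one character or runs another automaton. *)
type label =
  | Term                 (* consume one character from the arc's first set *)
  | NonTerm of string    (* push down: run the automaton of that non-terminal *)

type arc = { label : label; first : char list; target : int }

(* One automaton per non-terminal; state 0 is the start state. *)
type automaton = {
  arcs : (int * arc list) list;   (* outgoing arcs of each state *)
  finals : int list;              (* accepting states *)
}

(* [run g name input pos] runs the automaton for [name] from [pos] and
   returns [Some pos'] with the position after the recognised text,
   or [None] if it gets stuck in a non-final state. *)
let rec run g name input pos =
  let a = List.assoc name g in
  let rec step state pos =
    let out = try List.assoc state a.arcs with Not_found -> [] in
    (* Pick the arc whose first set contains the next character, if any. *)
    let candidate =
      if pos < String.length input then
        List.find_opt (fun arc -> List.mem input.[pos] arc.first) out
      else None
    in
    match candidate with
    | Some { label = Term; target; _ } -> step target (pos + 1)
    | Some { label = NonTerm n; target; _ } ->
      (* Recursive call: a new automaton starts, this one resumes afterwards. *)
      (match run g n input pos with
       | Some pos' -> step target pos'
       | None -> None)
    | None -> if List.mem state a.finals then Some pos else None
  in
  step 0 pos

let accepts g start input =
  run g start input 0 = Some (String.length input)

(* Toy grammar:  expr ::= term ('+' term)*
                 term ::= digit | '(' expr ')'  *)
let digits = ['0'; '1'; '2'; '3'; '4'; '5'; '6'; '7'; '8'; '9']
let first_term = '(' :: digits   (* first set of term, and hence of expr *)

let grammar = [
  ("expr",
   { arcs = [ (0, [ { label = NonTerm "term"; first = first_term; target = 1 } ]);
              (1, [ { label = Term; first = ['+']; target = 0 } ]) ];
     finals = [1] });
  ("term",
   { arcs = [ (0, [ { label = Term; first = digits; target = 1 };
                    { label = Term; first = ['(']; target = 2 } ]);
              (2, [ { label = NonTerm "expr"; first = first_term; target = 3 } ]);
              (3, [ { label = Term; first = [')']; target = 1 } ]) ];
     finals = [1] });
]
```

With this grammar, `accepts grammar "expr" "1+(2+3)"` holds, while inputs such as `"1+"` or `"(1"` are rejected. If the first sets overlapped, `step` would have to try several arcs and backtrack, which this sketch does not attempt.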
--
John Skaller, mailto:skaller@maxtal.com.au
10/1 Toxteth Rd Glebe NSW 2037 Australia
homepage: http://www.maxtal.com.au/~skaller
voice: 61-2-9660-0850