"nested" parsers


* "nested" parsers
@ 1999-12-02 13:25 Georges Mariano
  1999-12-03 13:31 ` Hendrik Tews
  1999-12-03 14:32 ` Remi VANICAT
  0 siblings, 2 replies; 4+ messages in thread
From: Georges Mariano @ 1999-12-02 13:25 UTC (permalink / raw)
  To: caml-list

Hello everyone,

We have developed a parser for a language L using ocamllex
and ocamlyacc, so we have a grammar file L.mly.

It appears that we can divide the language L into a few "sub"-languages.
Let's say that  L3 <: L2 <: L1 = L   ('<:' meaning "included in").

And we would like to have one entry point for each language
in our parser.
	a) Is it possible (i.e., with ocamlyacc)?
	(With one .mly file, or with several?)

	b) How do we do that? (It's not very clear to us ;-)
	(Pointers to specific documentation or examples are
	welcome...)

Thanks for any help

-- 
> Georges MARIANO                           tel: (33) 03 20 43 84 06
> INRETS,                                   fax: (33) 03 20 43 83 59
> Institut National de Recherches sur les Transport et leur Sécurité
> 20 rue Elisee Reclus    
> 59650 Villeneuve d'Ascq                   mailto:mariano@terre.inrets.fr
> FRANCE.                         
> B.U.G http://www3.inrets.fr/BUGhome.html  mailto:bug@estas1.inrets.fr





* Re: "nested" parsers
  1999-12-02 13:25 "nested" parsers Georges Mariano
@ 1999-12-03 13:31 ` Hendrik Tews
  1999-12-03 18:18   ` skaller
  1999-12-03 14:32 ` Remi VANICAT
  1 sibling, 1 reply; 4+ messages in thread
From: Hendrik Tews @ 1999-12-03 13:31 UTC (permalink / raw)
  To: caml-list

Georges Mariano writes:
   Date: Thu, 02 Dec 1999 13:25:04 +0000
   Subject: "nested" parsers

   It appears that we can divide the language L into a few "sub"-languages.
   Let's say that  L3 <: L2 <: L1 = L   ('<:' included in )

   And we would like to have one entry point for each language
   in our parser.

ocamlyacc generates a parsing function for each start symbol that
you declare in the grammar file. Therefore the easiest way (if
possible) is to rewrite your grammar so that each of the languages
Ln is generated by its own metasymbol (nonterminal). Then you list
these symbols in %start directives in the declarations section
of the grammar file.
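
As a rough illustration of the above, here is a minimal sketch of such a
grammar file. Everything in it is hypothetical (the tokens, the
nonterminals l1, l2, l3 and the toy semantic actions); it only assumes
that L3 <: L2 <: L1 holds, here as atoms <: sums <: sequences of sums.
Each start symbol also gets a %type declaration, which ocamlyacc requires.

```ocaml
/* l.mly: a sketch only; tokens and rules are hypothetical. */
%token <int> INT
%token PLUS SEMI LPAREN RPAREN EOF

/* One entry point per language; every start symbol needs a %type. */
%start l1 l2 l3
%type <int list> l1
%type <int> l2
%type <int> l3

%%

/* Entry points: each one wraps the nonterminal generating its language. */
l1:   seq EOF              { List.rev $1 } ;
l2:   sum EOF              { $1 } ;
l3:   atom EOF             { $1 } ;

/* L1: one or more sums separated by ';' */
seq:
    sum                    { [$1] }
  | seq SEMI sum           { $3 :: $1 } ;

/* L2: sums of atoms */
sum:
    atom                   { $1 }
  | sum PLUS atom          { $1 + $3 } ;

/* L3: integers and parenthesised sums */
atom:
    INT                    { $1 }
  | LPAREN sum RPAREN      { $2 } ;
```

Running ocamlyacc on this file should produce a module with three parsing
functions l1, l2 and l3, each taking a lexer function and a lexbuf, so a
call would look like `L.l2 Lexer.token lexbuf` (where Lexer.token stands
for whatever ocamllex entry point you already have).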

Bye,

Hendrik


* Re: "nested" parsers
  1999-12-02 13:25 "nested" parsers Georges Mariano
  1999-12-03 13:31 ` Hendrik Tews
@ 1999-12-03 14:32 ` Remi VANICAT
  1 sibling, 0 replies; 4+ messages in thread
From: Remi VANICAT @ 1999-12-03 14:32 UTC (permalink / raw)
  To: Georges Mariano; +Cc: caml-list

Georges Mariano <georges.mariano@inrets.fr> writes:

> Hello everyone,
> 
> We have developed a parser for a language L using ocamllex
> and ocamlyacc, so we have a grammar file L.mly.
> 
> It appears that we can divide the language L into a few "sub"-languages.
> Let's say that  L3 <: L2 <: L1 = L   ('<:' included in )
> 
> And we would like to have one entry point for each language
> in our parser.
> 	a) Is it possible (i.e., with ocamlyacc)?
> 	(With one .mly file, or with several?)
> 
> 	b) How do we do that? (It's not very clear to us ;-)
> 	(Pointers to specific documentation or examples are
> 	welcome...)

I am not sure what you want, but the ocamlyacc documentation
says this:

%start symbol ...  symbol 
          Declare the given symbols as entry points for the
          grammar. For each entry point, a parsing function with the
          same name is defined in the output module. Non-terminals
          that are not declared as entry points have no such parsing
          function. Start symbols must be given a type with the %type
          directive below.

So for each sub-language you may put the corresponding symbol in the
%start clause, with a matching %type declaration.





* Re: "nested" parsers
  1999-12-03 13:31 ` Hendrik Tews
@ 1999-12-03 18:18   ` skaller
  0 siblings, 0 replies; 4+ messages in thread
From: skaller @ 1999-12-03 18:18 UTC (permalink / raw)
  To: Hendrik Tews; +Cc: caml-list

Hendrik Tews wrote:
> 
> Georges Mariano writes:
>    Date: Thu, 02 Dec 1999 13:25:04 +0000
>    Subject: "nested" parsers
> 
>    It appears that we can divide the language L into a few "sub"-languages.
>    Let's say that  L3 <: L2 <: L1 = L   ('<:' included in )
> 
>    And we would like to have one entry point for each language
>    in our parser.
> 
> ocamlyacc generates a parsing function for each start symbol that
> you declare in the grammar file. Therefore the easiest way (if
> possible) is that you rewrite your grammar, such that for each of
> the languages Ln you have one metasymbol, which generates this
> language. Then you include several start directives in the header
> of the grammar file.

	But it isn't clear how to invoke these sublanguage parsers
recursively from within an action associated with a reduce
operation for the very sublanguage for which we wanted a separate
parser.

	For example, the Python language can conveniently
be divided into two distinct languages: a statement language
and an expression language. Certainly, I can have a non-terminal
for statements, and one for expressions, and make both entry
points --- and in my parser for Python I do just that.

	But that isn't what I want to do.
What I actually want is a recursive transition machine corresponding
to a meta-grammar (RHS of productions can be regular expressions;
for example, BNF).

	This can be organised by a finite state automaton in which
a non-terminal transition pushes down the whole automaton,
and starts a new one corresponding to the RHS of the production
defining the non-terminal labelling the arc being followed.

	Choosing the right arc is the hard bit. If the first sets
of all the arcs out of a node are known, the number of trial
parses can be reduced -- to one, if the first sets are disjoint.
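
	A minimal sketch of that idea in OCaml, under several
assumptions: the names (rx, grammar, run, accepts) and the toy grammar
are made up for illustration, the regular-expression right-hand sides
are interpreted directly rather than compiled to an automaton, and the
OCaml call stack stands in for "pushing down" the automaton. It
assumes no left recursion and non-nullable Star bodies, and it does no
trial parses: when the first sets of the arcs overlap it simply fails.

```ocaml
module CS = Set.Make (Char)

(* Right-hand sides are regular expressions over terminals and nonterminals. *)
type rx =
  | Tok of char          (* terminal arc *)
  | Nt of string         (* nonterminal arc: run that nonterminal's own RHS *)
  | Seq of rx list
  | Alt of rx list
  | Star of rx           (* body is assumed to be non-nullable *)

exception Parse_error of string

(* Toy meta-grammar (hypothetical): 'n' stands for any number token. *)
let grammar = [
  "expr", Seq [Nt "term"; Star (Seq [Tok '+'; Nt "term"])];
  "term", Alt [Tok 'n'; Seq [Tok '('; Nt "expr"; Tok ')']];
]

(* Can this RHS match the empty input? *)
let rec nullable = function
  | Tok _ -> false
  | Nt n -> nullable (List.assoc n grammar)
  | Seq rs -> List.for_all nullable rs
  | Alt rs -> List.exists nullable rs
  | Star _ -> true

(* First set: the terminals that can begin a match of this RHS. *)
let rec first = function
  | Tok c -> CS.singleton c
  | Nt n -> first (List.assoc n grammar)
  | Seq [] -> CS.empty
  | Seq (r :: rs) ->
      if nullable r then CS.union (first r) (first (Seq rs)) else first r
  | Alt rs -> List.fold_left (fun s r -> CS.union s (first r)) CS.empty rs
  | Star r -> first r

(* Does the next input token belong to the first set of r? *)
let starts r = function
  | c :: _ -> CS.mem c (first r)
  | [] -> false

(* Consume a prefix of the input matching r; return the remaining input. *)
let rec run r input =
  match r, input with
  | Tok c, x :: rest when x = c -> rest
  | Tok c, _ -> raise (Parse_error (Printf.sprintf "expected '%c'" c))
  | Nt n, _ -> run (List.assoc n grammar) input
  | Seq rs, _ -> List.fold_left (fun inp r -> run r inp) input rs
  | Alt rs, _ ->
      (* Pick the arc by one-token lookahead against the first sets. *)
      (match List.filter (fun r -> starts r input) rs with
       | [r] -> run r input
       | [] ->
           if nullable (Alt rs) then input
           else raise (Parse_error "no arc matches")
       | _ -> raise (Parse_error "first sets overlap: trial parses needed"))
  | Star r, _ -> if starts r input then run (Star r) (run r input) else input

let explode s = List.init (String.length s) (String.get s)

(* Any nonterminal can serve as an entry point. *)
let accepts start s =
  match run (Nt start) (explode s) with
  | [] -> true
  | _ :: _ -> false
  | exception Parse_error _ -> false
```

	With this, `accepts "expr" "n+(n+n)"` and `accepts "term" "(n)"`
both hold, while `accepts "term" "n+n"` does not, since "term" stops
before the '+'. Every nonterminal is usable as a start symbol, and a
nonterminal arc re-enters the same machinery recursively.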


-- 
John Skaller, mailto:skaller@maxtal.com.au
10/1 Toxteth Rd Glebe NSW 2037 Australia
homepage: http://www.maxtal.com.au/~skaller
voice: 61-2-9660-0850



