* GenLex stream parsers too eager?
@ 1999-01-26 17:33 David McClain
1999-01-26 19:17 ` mattwb
1999-02-01 13:13 ` Xavier Leroy
0 siblings, 2 replies; 3+ messages in thread
From: David McClain @ 1999-01-26 17:33 UTC (permalink / raw)
To: Liste CAML
It appears that the Genlex-derived parsers always eagerly tokenize
negative integer and float constants. This causes incorrect behavior in
closely spaced code (no spaces):
a-2*c --> parses as "a", "-2", "*", "c" instead of "a", "-", "2", "*", "c"
So instead of getting one expression tree, I get two, with the first
containing only "a".
Also, if the operator were exponentiation instead of multiplication, the
second tree would incorrectly compute a (possibly) complex-valued
expression instead of the simple negation of a real expression.
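The difference between the two readings can be seen with plain floats. This is only an illustrative sketch: the exponent 0.5 and the names `eager` and `intended` are assumptions chosen for the example, not taken from the original program.

```ocaml
(* Eager reading: the literal -2.0 is raised to the power 0.5.
   The real-valued result is undefined, so OCaml yields nan. *)
let eager = (-2.0) ** 0.5

(* Intended reading: negate the result of 2.0 raised to the power 0.5. *)
let intended = -. (2.0 ** 0.5)

let () =
  Printf.printf "eager is nan: %b, intended is negative: %b\n"
    (eager <> eager) (intended < 0.0)
```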
I have tried various workarounds, but they really obfuscate the original
recursive descent structure of parsers.
Any suggestions? (Perhaps I should be using OCAMLLEX and OCAMLYACC
instead?)
David McClain
Sr. Scientist
Raytheon Missile Systems Co.
Tucson, AZ
http://www.azstarnet.com/~dmcclain/homepage.htm
* Re: GenLex stream parsers too eager?
From: mattwb @ 1999-01-26 19:17 UTC (permalink / raw)
To: dmcclain; +Cc: caml-list
>
>From: "David McClain" <dmcclain@azstarnet.com>
>
>It appears that the Genlex-derived parsers always eagerly tokenize
>negative integer and float constants. This causes incorrect behavior in
>closely spaced code (no spaces):
>
> a-2*c --> parses as "a", "-2", "*", "c" instead of "a", "-", "2", "*", "c"
>
>So instead of getting one expression tree, I get two, with the first
>containing only "a".
>Also, if the operator were exponentiation instead of multiplication, the
>second tree would incorrectly compute a (possibly) complex-valued
>expression instead of the simple negation of a real expression.
>
>I have tried various workarounds, but they really obfuscate the original
>recursive descent structure of parsers.
>
>Any suggestions? (Perhaps I should be using OCAMLLEX and OCAMLYACC
>instead?)
>
I'd certainly suggest not using Genlex for anything but the most trivial
things. Ocamllex and ocamlyacc are not difficult, and they're much more
powerful.
* Re: GenLex stream parsers too eager?
From: Xavier Leroy @ 1999-02-01 13:13 UTC (permalink / raw)
To: David McClain, Liste CAML
> It appears that the Genlex-derived parsers always eagerly tokenize
> negative integer and float constants. This causes incorrect behavior
> in closely spaced code (no spaces):
>
> a-2*c --> parses as "a", "-2", "*", "c" instead of "a", "-", "2", "*", "c"
>
Right. This is a classic compiler problem: one can either tokenize
negative integer literals in the lexer (-?[0-9]+), which causes the
weird behavior above for expressions without spaces, or have the lexer
tokenize only positive integer literals ([0-9]+) and add a special
case in the parser to recognize "-" followed by an integer literal.
Genlex is very simple-minded and follows the former approach.
The Caml compilers follow the latter.
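The two lexing strategies can be compared with a small hand-written lexer. This is a minimal sketch, not the Genlex implementation: the `token` type, the `tokenize` function and its `signed_literals` flag are hypothetical names introduced for the illustration, and only integers, identifiers and single-character operators are handled.

```ocaml
type token = Ident of string | Int of int | Op of char

let is_digit c = c >= '0' && c <= '9'
let is_alpha c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')

(* With signed_literals = true the lexer behaves like the rule -?[0-9]+
   (the former approach); with false it behaves like [0-9]+ (the latter). *)
let tokenize ~signed_literals s =
  let n = String.length s in
  (* Index just past the run of characters satisfying p, starting at i. *)
  let rec span p i = if i < n && p s.[i] then span p (i + 1) else i in
  let rec go i acc =
    if i >= n then List.rev acc
    else
      let c = s.[i] in
      if c = ' ' then go (i + 1) acc
      else if is_digit c then
        let j = span is_digit i in
        go j (Int (int_of_string (String.sub s i (j - i))) :: acc)
      else if signed_literals && c = '-' && i + 1 < n && is_digit s.[i + 1]
      then
        (* Eagerly glue the minus sign onto the digits that follow it. *)
        let j = span is_digit (i + 1) in
        go j (Int (int_of_string (String.sub s i (j - i))) :: acc)
      else if is_alpha c then
        let j = span is_alpha i in
        go j (Ident (String.sub s i (j - i)) :: acc)
      else go (i + 1) (Op c :: acc)
  in
  go 0 []

(* Former approach: a, -2, *, c *)
let eager_tokens = tokenize ~signed_literals:true "a-2*c"

(* Latter approach: a, -, 2, *, c *)
let unsigned_tokens = tokenize ~signed_literals:false "a-2*c"
```

With the flag set, the parser never sees a binary minus between "a" and "2", which is why the expression splits into two trees.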
(The latter approach has its own problems. For instance, Caml parses
"f -1" as "f minus 1", not as "f applied to the integer -1", as many
users expect.)
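A short illustration of that pitfall in current OCaml; the function `f` and the value `x` are made up for the example.

```ocaml
let f x = x + 1

(* Writing f -1 without parentheses is read as a subtraction whose left
   operand is the function f, which the type checker rejects, so that form
   is not included here. *)

(* Parentheses force the negative literal to be the argument. *)
let applied = f (-1)

(* The prefix operator ~- is integer negation and binds tighter than
   function application, so this is also f applied to -1. *)
let also_applied = f ~-1

(* With an integer on the left, the same spelling is a plain subtraction. *)
let x = 10
let subtracted = x -1
```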
> Any suggestions? (Perhaps I should be using OCAMLLEX and OCAMLYACC instead?)
You'll have to write your own lexer, indeed. You can either use ocamllex
to generate it, or start with the source code of the Genlex module
and customize it to your needs.
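For the parser side, here is a minimal recursive descent sketch that works on tokens with unsigned integer literals and treats a leading "-" as a special case in the parser. Everything in it is an assumption made for illustration: the `token` and `expr` types, the function names, and the use of '^' as the exponentiation operator. It is not the code of the Caml compilers.

```ocaml
type token = Ident of string | Int of int | Op of char

type expr =
  | Num of int
  | Var of string
  | Neg of expr
  | Bin of char * expr * expr

(* Each parsing function returns the tree it built and the remaining tokens. *)
let rec parse_expr toks =
  let lhs, rest = parse_term toks in
  expr_tail lhs rest

and expr_tail lhs = function
  | Op (('+' | '-') as op) :: rest ->
      let rhs, rest = parse_term rest in
      expr_tail (Bin (op, lhs, rhs)) rest
  | rest -> (lhs, rest)

and parse_term toks =
  let lhs, rest = parse_unary toks in
  term_tail lhs rest

and term_tail lhs = function
  | Op '*' :: rest ->
      let rhs, rest = parse_unary rest in
      term_tail (Bin ('*', lhs, rhs)) rest
  | rest -> (lhs, rest)

(* The special case: a minus sign in operand position is a negation. *)
and parse_unary = function
  | Op '-' :: rest ->
      let e, rest = parse_unary rest in
      (Neg e, rest)
  | toks -> parse_power toks

(* Exponentiation binds tighter than unary minus and is right-associative. *)
and parse_power toks =
  let base, rest = parse_atom toks in
  match rest with
  | Op '^' :: rest ->
      let exponent, rest = parse_unary rest in
      (Bin ('^', base, exponent), rest)
  | _ -> (base, rest)

and parse_atom = function
  | Int n :: rest -> (Num n, rest)
  | Ident x :: rest -> (Var x, rest)
  | Op '(' :: rest -> (
      match parse_expr rest with
      | e, Op ')' :: rest -> (e, rest)
      | _ -> failwith "expected ')'")
  | _ -> failwith "unexpected token"

let parse toks =
  match parse_expr toks with
  | e, [] -> e
  | _ -> failwith "trailing tokens"

(* Tokens for a-2*c as an unsigned-literal lexer would produce them. *)
let one_tree = parse [Ident "a"; Op '-'; Int 2; Op '*'; Ident "c"]

(* Tokens for -2^c: the negation applies to the whole power. *)
let negated_power = parse [Op '-'; Int 2; Op '^'; Ident "c"]
```

Because the minus sign is only ever an operator token, "a-2*c" yields a single tree, and "-2^c" is read as the negation of "2^c" rather than a negative base raised to a power.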
Best regards,
- Xavier Leroy