* ocamlyacc/ocamllex problems
@ 2001-01-31 17:16 Laurent Reveillere
2001-02-01 14:22 ` Xavier Leroy
From: Laurent Reveillere @ 2001-01-31 17:16 UTC (permalink / raw)
To: caml-list
I am writing a parser that uses Parsing.rhs_start and Parsing.rhs_end in
a rule.
The problem is as follows:
1) If I use a single rule in the lexer that matches the whole token, all is fine.
ex:
| "'" ['0' '1' '*' '.']+ "'" { ... }
2) If I use a separate automaton in the lexer to match the same token, the
results of Parsing.rhs_start and Parsing.rhs_end are wrong.
ex:
| "'" { ... bits lexbuf ... }
and bits = parse
| '\'' { ... }
| ['0' '1' '.' '*' ] { ... }
| eof { ... }
| _ { ... }
Here is the output of a debug printf showing the rhs_start and rhs_end values:
case 1) Debug: pats='1..00000' at (964,965)
case 2) Debug: pats='1..00000' at (955,965)
I am not sure I understand the reason for this problem.
--
Laurent
* Re: ocamlyacc/ocamllex problems
From: Xavier Leroy @ 2001-02-01 14:22 UTC (permalink / raw)
To: Laurent Reveillere; +Cc: caml-list
> I am writing a parser that uses Parsing.rhs_start and Parsing.rhs_end in
> a rule. The problem is the following,
>
> 1) If I use a simple rule in the lexer that matches a token all is fine.
> ex:
> | "'" ['0' '1' '*' '.']+ "'" { ... }
>
> 2) If I use a separate automaton in the lexer to match the same token, the
> results of Parsing.rhs_start and Parsing.rhs_end are wrong.
> ex:
> | "'" { ... bits lexbuf ... }
> and bits = parse
> | '\'' { ... }
> | ['0' '1' '.' '*' ] { ... }
> | eof { ... }
> | _ { ... }
>
> I am not sure I understand the reason for this problem.
For terminal symbols (tokens), the locations returned by
Parsing.rhs_start and Parsing.rhs_end are those returned by
Lexing.lexeme_start and Lexing.lexeme_end. However, these two
functions track the location of the *last* regular expression matched by
the ocamllex-generated automaton. (This location is stored and
updated in place in the "lexbuf" argument.)
So, if your lexing rule recursively calls other lexing rules (as in
case 2 above), the locations reported correspond to the part of the
token that was last matched by a regular expression (i.e. the last
"bit" of the token in your example 2).
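The effect can be reproduced without running ocamllex. The following is only a sketch in plain OCaml: it assumes a recent OCaml release, where Lexing.lexeme_start reads the lex_start_p field, and simulate_match is a hypothetical helper that imitates the position bookkeeping done after one regexp match (it is not a function of the Lexing module).

```ocaml
(* Hypothetical stand-in for one regexp match of [n] characters.
   It updates the lexbuf positions roughly the way ocamllex-generated
   code does after each match; it is not part of the Lexing API. *)
let simulate_match (lexbuf : Lexing.lexbuf) (n : int) : unit =
  let open Lexing in
  (* The new lexeme starts where the previous one ended. *)
  lexbuf.lex_start_pos <- lexbuf.lex_curr_pos;
  lexbuf.lex_curr_pos <- lexbuf.lex_curr_pos + n;
  lexbuf.lex_start_p <- lexbuf.lex_curr_p;
  lexbuf.lex_curr_p <-
    { lexbuf.lex_curr_p with
      pos_cnum = lexbuf.lex_abs_pos + lexbuf.lex_curr_pos }

let input = "'1..00000'"

(* Case 2 of the question: one match for the opening quote, then one
   match per remaining character, as a rule like [bits] would do. *)
let naive_span () : int * int =
  let lexbuf = Lexing.from_string input in
  simulate_match lexbuf 1;
  for _i = 2 to String.length input do
    simulate_match lexbuf 1
  done;
  (Lexing.lexeme_start lexbuf, Lexing.lexeme_end lexbuf)
```

In this model naive_span () evaluates to (9, 10): only the final one-character match is reported, not the ten characters of the whole token.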
To get correct locations in example 2, a bit of "lexbuf" hacking is
required to restore the start location to what it was when the first
regexp was matched:
  | "'"  { let start = Lexing.lexeme_start lexbuf in
           let res = ... bits lexbuf ... in
           lexbuf.Lexing.lex_start_pos <- start - lexbuf.Lexing.lex_abs_pos;
           res }
and bits = parse ...
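The save-and-restore idea can also be packaged as a helper. This is a sketch under stated assumptions, not a definitive implementation: keep_start and fake_match are hypothetical names, and restoring lex_start_p in addition to lex_start_pos is an assumption about recent OCaml releases, where Lexing.lexeme_start reads lex_start_p.

```ocaml
(* Hypothetical helper: run a nested lexing function [sub], then put
   the start of the outer token back so the whole token is reported. *)
let keep_start (lexbuf : Lexing.lexbuf) (sub : Lexing.lexbuf -> 'a) : 'a =
  let open Lexing in
  (* Remember the absolute start offset and the start position record. *)
  let start_abs = lexbuf.lex_abs_pos + lexbuf.lex_start_pos in
  let start_p = lexbuf.lex_start_p in
  let res = sub lexbuf in
  (* lex_abs_pos may differ now if the buffer was refilled meanwhile. *)
  lexbuf.lex_start_pos <- start_abs - lexbuf.lex_abs_pos;
  lexbuf.lex_start_p <- start_p;
  res

(* Hypothetical stand-in for one regexp match of [n] characters; a real
   lexer would call an ocamllex-generated rule instead. *)
let fake_match (lexbuf : Lexing.lexbuf) (n : int) : unit =
  let open Lexing in
  lexbuf.lex_start_pos <- lexbuf.lex_curr_pos;
  lexbuf.lex_curr_pos <- lexbuf.lex_curr_pos + n;
  lexbuf.lex_start_p <- lexbuf.lex_curr_p;
  lexbuf.lex_curr_p <-
    { lexbuf.lex_curr_p with
      pos_cnum = lexbuf.lex_abs_pos + lexbuf.lex_curr_pos }

let demo () : int * int * string =
  let lexbuf = Lexing.from_string "'1..00000'" in
  fake_match lexbuf 1;  (* the opening quote *)
  (* The nine remaining characters, matched one at a time. *)
  keep_start lexbuf (fun lb -> for _i = 1 to 9 do fake_match lb 1 done);
  (Lexing.lexeme_start lexbuf, Lexing.lexeme_end lexbuf, Lexing.lexeme lexbuf)
```

Here demo () evaluates to (0, 10, "'1..00000'"), i.e. the span of the whole token. In an ocamllex action the helper would be called as keep_start lexbuf bits. If the buffer is refilled in the middle of a long token, the start of the token may no longer be held in lex_buffer, so Lexing.lexeme should probably not be relied on after the restore.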
Hope this helps,
- Xavier Leroy