From: Xavier Leroy <Xavier.Leroy@inria.fr>
To: Laurent Reveillere <Laurent.Reveillere@labri.u-bordeaux.fr>
Cc: caml-list@inria.fr
Subject: Re: ocamlyacc/ocamllex problems
Date: Thu, 1 Feb 2001 15:22:06 +0100
Message-ID: <20010201152206.A30653@pauillac.inria.fr>
In-Reply-To: <3A784863.78765C0@labri.u-bordeaux.fr>; from Laurent.Reveillere@labri.u-bordeaux.fr on Wed, Jan 31, 2001 at 06:16:19PM +0100
> I am writing a parser that uses Parsing.rhs_start and Parsing.rhs_end in
> a rule. The problem is the following,
>
> 1) If I use a simple rule in the lexer that matches a token all is fine.
> ex:
> | "'" ['0' '1' '*' '.']+ "'" { ... }
>
> 2) If I use an automaton in the lexer for matching the same token, the
> results of Parsing.rhs_start and Parsing.rhs_end are wrong.
> ex:
> | "'" { ... bits lexbuf ... }
> and bits = parse
> | '\'' { ... }
> | ['0' '1' '.' '*' ] { ... }
> | eof { ... }
> | _ { ... }
>
> I am not sure I understand the reason for my problem.
For terminal symbols (tokens), the locations returned by
Parsing.rhs_start and Parsing.rhs_end are those returned by
Lexing.lexeme_start and Lexing.lexeme_end. However, these two
functions track the location of the *last* regular expression matched by
the ocamllex-generated automaton. (This location is stored and
updated in place in the "lexbuf" argument.)
So, if your lexing rule recursively calls other lexing rules (as in
case 2 above), the locations reported correspond to the part of the
token that was last matched by a regular expression (i.e. the last
"bit" of the token in your example 2).
To get correct locations in example 2, a bit of "lexbuf" hacking is
required to restore the start location to what it was when the first
regexp was matched:
  | "'"   { let start = Lexing.lexeme_start lexbuf in
            let res = ... bits lexbuf ... in
            lexbuf.Lexing.lex_start_pos <- start - lexbuf.Lexing.lex_abs_pos;
            res }

  and bits = parse ...
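For readers who want something closer to a complete file, here is a minimal ocamllex sketch of the same trick. It is an illustration under assumptions, not the original poster's lexer: the token type (BITS, EOF), the whitespace rule, the Buffer-based accumulation and the helper with_saved_start are all hypothetical names introduced here. The sketch also restores lex_start_p, on the assumption that it is compiled with a later OCaml release in which Lexing keeps position records alongside the integer offsets; with a lexbuf that lacks that field, that line would simply be dropped.

```ocaml
(* lexer.mll: hypothetical sketch; token names and helper are assumptions *)
{
  type token = BITS of string | EOF

  (* Run [f] on the lexbuf, then restore the start location to what it
     was when the current (outer) regexp was matched, so that the whole
     token is reported rather than its last matched piece. *)
  let with_saved_start lexbuf f =
    let start = Lexing.lexeme_start lexbuf in
    let start_p = Lexing.lexeme_start_p lexbuf in
    let res = f lexbuf in
    (* Offset relative to the buffer, as in the fix above. *)
    lexbuf.Lexing.lex_start_pos <- start - lexbuf.Lexing.lex_abs_pos;
    (* Assumption: position record present in later OCaml releases. *)
    lexbuf.Lexing.lex_start_p <- start_p;
    res
}

rule token = parse
  | [' ' '\t' '\n']+  { token lexbuf }
  | "'"               { with_saved_start lexbuf (fun lb ->
                          BITS (bits (Buffer.create 16) lb)) }
  | eof               { EOF }

(* Secondary rule: consumes the body of the token one character at a
   time, up to and including the closing quote. *)
and bits buf = parse
  | "'"                     { Buffer.contents buf }
  | ['0' '1' '.' '*'] as c  { Buffer.add_char buf c; bits buf lexbuf }
  | eof                     { failwith "unterminated bit string" }
  | _                       { failwith "illegal character in bit string" }
```

With this in place, after token returns for an input such as '01*', the start location should again be that of the opening quote, while the end location is still the one left by the last regexp matched in bits (just past the closing quote). One caveat of the sketch: it assumes the lexer buffer is not refilled in the middle of the token, since the saved start may then point at input that has already been discarded.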
Hope this helps,
- Xavier Leroy