Re: ocamlyacc/ocamllex problems

Mailing list for all users of the OCaml language and system.
 help / color / mirror / Atom feed

From: Xavier Leroy <Xavier.Leroy@inria.fr>
To: Laurent Reveillere <Laurent.Reveillere@labri.u-bordeaux.fr>
Cc: caml-list@inria.fr
Subject: Re: ocamlyacc/ocamllex problems
Date: Thu, 1 Feb 2001 15:22:06 +0100	[thread overview]
Message-ID: <20010201152206.A30653@pauillac.inria.fr> (raw)
In-Reply-To: <3A784863.78765C0@labri.u-bordeaux.fr>; from Laurent.Reveillere@labri.u-bordeaux.fr on Wed, Jan 31, 2001 at 06:16:19PM +0100

> I am writing a parser that uses Parsing.rhs_start and Parsing.rhs_end in
> a rule.  The problem is the following,
> 
> 1) If I use a simple rule in the lexer that matches a token all is fine.
> ex:
>     |   "'" ['0' '1' '*' '.']+ "'"  { ... }
> 
> 2) If I use an automata in the lexer for matching the same token, the
> results of Parsing.rhs_start and Parsing.rhs_end are wrong.
> ex:	
>     | "'"  { ... bits lexbuf ... }
> and bits = parse
>     | '\'' { ... }
>     | ['0' '1' '.' '*' ] { ... }
>     | eof  { ... }
>     | _    { ... }
>
> I am not sure to undertand the reasons of my problem?

For terminal symbols (tokens), the locations returned by
Parsing.rhs_start and Parsing.rhs_end are those returned by
Lexing.lexeme_start and Lexing.lexeme_end.  However, these two
functions track the location of the *last* regular expression matched by
the ocamllex-generated automaton.  (This location is stored and
updated in place in the "lexbuf" argument.)

So, if your lexing rule recursively calls other lexing rules (as in
case 2 above), the locations reported correspond to the part of the
token that was last matched by a regular expression (i.e. the last
"bit" of the token in your example 2).

To get correct locations in example 2, a bit of "lexbuf" hacking is
required to restore the start location to what it was when the first
regexp was matched:

| "'"  { let start = Lexing.lexeme_start lexbuf in
         let res = ... bits lexbuf ... in
         lexbuf.Lexing.lex_start_pos <- start - lexbuf.Lexing.lex_abs_pos;
         res }
and bits = parse ...

Hope this helps,

- Xavier Leroy

     prev parent reply	other threads:[~2001-02-02 15:23 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2001-01-31 17:16 Laurent Reveillere
2001-02-01 14:22 ` Xavier Leroy [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20010201152206.A30653@pauillac.inria.fr \
    --to=xavier.leroy@inria.fr \
    --cc=Laurent.Reveillere@labri.u-bordeaux.fr \
    --cc=caml-list@inria.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox