* [Caml-list] Lexing.lexeme_start_p broken?
From: Scott Duckworth @ 2004-09-17 22:07 UTC
To: caml-list

I can't seem to get the function Lexing.lexeme_start_p to return a
position with correct information in it.  Here is my code (test.mll):

    {
      open Lexing
    }

    rule scan = parse
        eof     { raise End_of_file }
      | _ as x  { let pos = lexeme_start_p lexbuf in
                  Printf.printf "ASCII %d at line %d col %d\n"
                    (int_of_char x) pos.pos_lnum pos.pos_bol;
                  scan lexbuf }

    {
      try scan (from_channel stdin)
      with End_of_file -> ()
    }

I do the following, but I always get incorrect output:

    [duckwos@chef]$ ocamllex test.mll
    3 states, 257 transitions, table size 1046 bytes
    [duckwos@chef]$ ocamlc -o test test.ml
    [duckwos@chef]$ ./test << EOF
    > position
    > does not
    > change
    > EOF
    ASCII 112 at line 1 col 0
    ASCII 111 at line 1 col 0
    ASCII 115 at line 1 col 0
    ASCII 105 at line 1 col 0
    ASCII 116 at line 1 col 0
    ASCII 105 at line 1 col 0
    ASCII 111 at line 1 col 0
    ASCII 110 at line 1 col 0
    ASCII 10 at line 1 col 0
    ASCII 100 at line 1 col 0
    ASCII 111 at line 1 col 0
    ASCII 101 at line 1 col 0
    ASCII 115 at line 1 col 0
    ASCII 32 at line 1 col 0
    ASCII 110 at line 1 col 0
    ASCII 111 at line 1 col 0
    ASCII 116 at line 1 col 0
    ASCII 10 at line 1 col 0
    ASCII 99 at line 1 col 0
    ASCII 104 at line 1 col 0
    ASCII 97 at line 1 col 0
    ASCII 110 at line 1 col 0
    ASCII 103 at line 1 col 0
    ASCII 101 at line 1 col 0
    ASCII 10 at line 1 col 0
    [duckwos@chef]$

Any ideas why this is happening?  I get the same results even if I am
reading from an actual file opened with
Lexing.from_file (open_in "filename").

One more thing.  How does the pos_fname field in the position type get
its value if there is no way for the Lexing module to know the file
name?  Am I missing something here?

Thanks in advance!
--
Scott Duckworth

-------------------
To unsubscribe, mail caml-list-request@inria.fr  Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs  FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners

* Re: [Caml-list] Lexing.lexeme_start_p broken?
From: Jean-Christophe Filliatre @ 2004-09-20 9:23 UTC
To: Scott Duckworth; +Cc: caml-list

Scott Duckworth writes:
 > I can't seem to get the function Lexing.lexeme_start_p to return a
 > position with correct information in it.  Here is my code (test.mll):

In the OCaml manual, in the documentation of the Lexing module, you
can read:

  "Note that the lexing engine will only manage the pos_cnum field of
   lex_curr_p by updating it with the number of characters read since
   the start of the lexbuf. For the other fields to be accurate, they
   must be initialised before the first use of the lexbuf, and updated
   by the lexer actions."

(below the "type lexbuf = ...").

To update these fields, the best way is to look into the OCaml sources
to see how this is done in OCaml's own parser.  In file
parsing/lexer.mll, the function update_loc does the job; it is called
each time a newline character is read.  It is quite complicated,
because it handles many different things at the same time, but to
update the fields pos_lnum and pos_bol it can be simplified to

    let update_loc lexbuf =
      let pos = lexbuf.lex_curr_p in
      lexbuf.lex_curr_p <-
        { pos with pos_lnum = pos.pos_lnum + 1; pos_bol = pos.pos_cnum }

Then you call this function for each newline in your lexer actions, e.g.

    | '\n'
        { update_loc lexbuf; token lexbuf }

Hope this helps,
--
Jean-Christophe Filliâtre (http://www.lri.fr/~filliatr)
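
The bookkeeping described in the manual excerpt can be tried out without
ocamllex at all.  What follows is a minimal sketch, not the code from
parsing/lexer.mll: set_filename, advance, column and positions are
hypothetical helper names.  advance applies the same record update as
update_loc whenever it steps over a newline.  The sketch also illustrates
two points from the original question: the column is not stored in the
position (it is pos_cnum - pos_bol, whereas pos_bol alone is the offset of
the start of the line), and pos_fname has to be filled in by the caller.

```ocaml
open Lexing

(* Hypothetical helper: the Lexing module cannot know the file name,
   so the caller stores it in the lexbuf before lexing starts. *)
let set_filename (lexbuf : lexbuf) (fname : string) : unit =
  lexbuf.lex_curr_p <- { lexbuf.lex_curr_p with pos_fname = fname }

(* Hypothetical helper: step a position over one character.  On a
   newline it performs the same update as update_loc: bump pos_lnum and
   record the offset of the next line's first character in pos_bol. *)
let advance (pos : position) (c : char) : position =
  let next = { pos with pos_cnum = pos.pos_cnum + 1 } in
  if c = '\n' then
    { next with pos_lnum = next.pos_lnum + 1; pos_bol = next.pos_cnum }
  else next

(* The column is derived from two fields; it is not stored. *)
let column (pos : position) : int = pos.pos_cnum - pos.pos_bol

(* Start position (line, column) of every character of [s]. *)
let positions (s : string) : (char * int * int) list =
  let pos = ref { pos_fname = ""; pos_lnum = 1; pos_bol = 0; pos_cnum = 0 } in
  let acc = ref [] in
  String.iter
    (fun c ->
      let p = !pos in
      acc := (c, p.pos_lnum, column p) :: !acc;
      pos := advance p c)
    s;
  List.rev !acc
```

In a real ocamllex lexer the engine maintains pos_cnum itself, so only
the newline branch of advance has to appear in the lexer actions.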

* Re: [Caml-list] Lexing.lexeme_start_p broken?
From: skaller @ 2004-09-20 14:44 UTC
To: Jean-Christophe Filliatre; +Cc: caml-list

On Mon, 2004-09-20 at 19:23, Jean-Christophe Filliatre wrote:

> simplified to
>
>     let update_loc lexbuf =
>       let pos = lexbuf.lex_curr_p in
>       lexbuf.lex_curr_p <-
>         { pos with pos_lnum = pos.pos_lnum + 1; pos_bol = pos.pos_cnum }
>
> then you call this function for each newline in your lexer actions, e.g.
>
>     | '\n'
>         { update_loc lexbuf; token lexbuf }
>
> Hope this helps,

How does that help, if the tokeniser isn't using the lexbuf?
Here's my parser:

    let parse_tokens (parser:'a parser_t) (tokens: Flx_parse.token list) =
      let toker = (new tokeniser tokens) in
      try
        parser (toker#token_src) (Lexing.from_string "dummy" )
      with _ ->
        toker#report_syntax_error;
        raise (Flx_exceptions.ParseError "Parsing Tokens")

The token supplying function never looks at the lexbuf.
The parser does, to report errors, so I have to trash
the parser exceptions, since the locations are wrong.

--
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850,
snail: PO BOX 401 Glebe NSW 2037 Australia
Check out the Felix programming language http://felix.sf.net

* Re: [Caml-list] Lexing.lexeme_start_p broken?
From: Damien Doligez @ 2004-09-21 8:26 UTC
To: skaller; +Cc: caml-list

On Sep 20, 2004, at 16:44, skaller wrote:

> How does that help, if the tokeniser isn't using the lexbuf?
> Here's my parser:
>
>     let parse_tokens (parser:'a parser_t) (tokens: Flx_parse.token list) =
>       let toker = (new tokeniser tokens) in
>       try
>         parser (toker#token_src) (Lexing.from_string "dummy" )
>       with _ ->
>         toker#report_syntax_error;
>         raise (Flx_exceptions.ParseError "Parsing Tokens")
>
> The token supplying function never looks at the lexbuf.
> The parser does, to report errors, so I have to trash
> the parser exceptions, since the locations are wrong.

The token supplying function is supposed to _update_ the lexbuf, if
you want the parser to report the correct locations.  ocamllex does
some of the work by updating the char count, the rest is up to the
lexer itself.

-- Damien
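
One way to read this advice for a front end that parses from a token
list: record each token's start and end position while the list is being
collected, and have the token-supplying function copy them into the dummy
lexbuf just before it returns the token.  A minimal sketch, assuming the
parser takes its locations from the lex_start_p and lex_curr_p fields of
the lexbuf it was given (which is what the advice above implies); the
token type, mk_pos and token_source are hypothetical names, not part of
Felix or of the OCaml distribution:

```ocaml
open Lexing

(* Hypothetical token type standing in for the parser's real one. *)
type token = INT of int | PLUS | EOF

(* Hypothetical helper to build a position by hand. *)
let mk_pos (line : int) (bol : int) (cnum : int) : position =
  { pos_fname = "input"; pos_lnum = line; pos_bol = bol; pos_cnum = cnum }

(* Turn a list of (token, start, end) triples into a function with the
   usual lexer type, lexbuf -> token.  It never reads characters from the
   lexbuf; it only updates the two position fields. *)
let token_source (tokens : (token * position * position) list)
    : lexbuf -> token =
  let remaining = ref tokens in
  fun lexbuf ->
    match !remaining with
    | [] -> EOF
    | (tok, start_p, end_p) :: rest ->
        remaining := rest;
        lexbuf.lex_start_p <- start_p;
        lexbuf.lex_curr_p <- end_p;
        tok
```

The result of token_source can be passed to a generated parser together
with Lexing.from_string "dummy", in place of a token function that
ignores the lexbuf.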

* Re: [Caml-list] Lexing.lexeme_start_p broken?
From: skaller @ 2004-09-21 9:25 UTC
To: Damien Doligez; +Cc: caml-list

On Tue, 2004-09-21 at 18:26, Damien Doligez wrote:
> On Sep 20, 2004, at 16:44, skaller wrote:

> The token supplying function is supposed to _update_ the lexbuf, if
> you want the parser to report the correct locations.  ocamllex does
> some of the work by updating the char count,

Not in my case it doesn't.  The lexing function isn't
an ocamllex lexer.  So I'd have to update the char count too.

I actually am using ocamllex, but I drive it manually
to collect a token list, then feed the list to the parser.

Hmmm.. OK, how is this for an idea:

Suppose we add to the lexbuf a mutable field of type:

    lexbuf -> loc

which returns the location information the parser needs
given a lexbuf.  The parser then fetches the location
information by calling this function on the lexbuf
from which it was obtained.

I can then provide a function which accepts my own
state object and curry it.  This way, I don't have
to keep updating the lexbuf, and the parser cannot
see the lexbuf details.  My routine might be expensive --
but it only gets called once when there is a parse error,
not every token.

Whilst I don't think this is a perfect solution, it does
seem to partially decouple the parser from the lexbuf
by at least abstracting it using a function.

Would this interfere with the OCaml bootstrap?

*** a better solution might be to pass this function
directly to the parser, thereby decoupling it
entirely from the lexer.  However that changes
the type of parser functions.  That can easily
be fixed though -- just make a compatibility
wrapper which calls the full parser function,
passing a default function value.

If there was any interest, I could probably provide
a design which provided proper decoupling, whilst
retaining compatibility using wrappers and defaults.
[But I imagine it could also be done easily by someone
on the OCaml team -- and throw in a user state object
at the same time please, as has been done for the lexer :]

The parser does need to get tokens, and it may need
location information, but it should not depend on
an object whose principal purpose is to support lexing.

In theory this is also true for generated lexers: they
shouldn't depend on any lexbufs.  However for performance
reasons, abstracting a character source probably
isn't tolerable.

--
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850,
snail: PO BOX 401 Glebe NSW 2037 Australia
Check out the Felix programming language http://felix.sf.net
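
The shape of the second proposal (pass the location function to the
parser, and keep the old interface as a wrapper) can be sketched as
follows.  This is only an illustration under stated assumptions: the
hand-written parse_with_loc stands in for a generated parser, and the
loc and token types, Parse_error, parse_with_loc and parse are all
hypothetical names; nothing here is an existing ocamlyacc interface.

```ocaml
type loc = { line : int; col : int }      (* hypothetical location type *)
type token = INT of int | PLUS | EOF      (* hypothetical token type *)

exception Parse_error of loc

(* "Full" parser: it gets tokens from [next] and calls [loc] only when an
   error has to be reported.  Grammar: INT (PLUS INT)* EOF; the result is
   the sum of the integers. *)
let parse_with_loc ~(loc : unit -> loc) (next : unit -> token) : int =
  let fail () = raise (Parse_error (loc ())) in
  let operand () = match next () with INT n -> n | _ -> fail () in
  let rec rest acc =
    match next () with
    | EOF -> acc
    | PLUS -> rest (acc + operand ())
    | INT _ -> fail ()
  in
  rest (operand ())

(* Compatibility wrapper with the traditional (lexer, lexbuf) interface:
   the default location function reads the lexbuf's current position. *)
let parse (lexer : Lexing.lexbuf -> token) (lexbuf : Lexing.lexbuf) : int =
  let loc () =
    let p = lexbuf.Lexing.lex_curr_p in
    { line = p.Lexing.pos_lnum; col = p.Lexing.pos_cnum - p.Lexing.pos_bol }
  in
  parse_with_loc ~loc (fun () -> lexer lexbuf)
```

A caller that keeps its own state object passes a closure over that state
as ~loc; the closure may be expensive, since it runs only on the error
path.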