From: Erik de Castro Lopo
To: caml-list@inria.fr
Date: Sun, 5 Aug 2007 14:52:38 +1000
Subject: Re: [Caml-list] Handling include files using ocamllex
Message-Id: <20070805145238.76aed5d6.mle+ocaml@mega-nerd.com>
In-Reply-To: <1186060770.23889.39.camel@rosella.wigram>

skaller wrote:

> I recommend abandoning the idea of passing a
> lexbuf to a parser: make a dummy lexbuf and pass that to
> keep Ocamlyacc happy, but make sure you never use it.
>
> Instead, create an Ocaml class with a get_token method,
> and use the closure of that method over the class PLUS
> a dummy lexbuf.

I tried that with a class and ran into all sorts of problems
related to trying to use instance data in the constructor.

In the end, I ditched the class/object but kept your idea and
approached it from a more functional direction, which resulted
in this (filename lexstack.ml):

------------------8<------------------8<------------------
(* The Lexstack type. *)
type 'a t =
{   mutable stack : (string * in_channel * Lexing.lexbuf) list ;
    mutable filename : string ;
    mutable chan : in_channel ;
    mutable lexbuf : Lexing.lexbuf ;
    lexfunc : Lexing.lexbuf -> 'a ;
    }

(*
** Create a lexstack with an initial top level filename and the
** lexer function.
*)
let create top_filename lexer_function =
    let chan = open_in top_filename in
    {   stack = [] ; filename = top_filename ; chan = chan ;
        lexbuf = Lexing.from_channel chan ;
        lexfunc = lexer_function
        }

(*
** Get the next token. Need to accept an unused dummy lexbuf so that
** a closure consisting of the function and a lexstack can be passed
** to the ocamlyacc generated parser.
*)
let rec get_token ls dummy_lexbuf =
    match ls.lexfunc ls.lexbuf with
    |   Parser.TOK_INCLUDE fname ->
            (* Save the current file's state, then switch to the include. *)
            ls.stack <- (ls.filename, ls.chan, ls.lexbuf) :: ls.stack ;
            ls.filename <- fname ;
            ls.chan <- open_in fname ;
            ls.lexbuf <- Lexing.from_channel ls.chan ;
            get_token ls dummy_lexbuf
    |   Parser.TOK_EOF ->
            (   match ls.stack with
                |   [] -> Parser.TOK_EOF
                |   (fn, ch, lb) :: tail ->
                        (* Done with the included file: close it and
                        ** resume the including file. The saved lexbuf
                        ** must be restored as well, otherwise the rest
                        ** of the including file is never read.
                        *)
                        close_in ls.chan ;
                        ls.filename <- fn ;
                        ls.chan <- ch ;
                        ls.lexbuf <- lb ;
                        ls.stack <- tail ;
                        get_token ls dummy_lexbuf
                )
    |   anything -> anything

(* Get the current lexeme. *)
let lexeme ls =
    Lexing.lexeme ls.lexbuf

(* Get filename, line number and column number of current lexeme. *)
let current_pos ls =
    let pos = Lexing.lexeme_end_p ls.lexbuf in
    let linepos = pos.Lexing.pos_cnum - pos.Lexing.pos_bol -
                    String.length (Lexing.lexeme ls.lexbuf) in
    ls.filename, pos.Lexing.pos_lnum, linepos
------------------8<------------------8<------------------

This can then be used like this (E is an exception carrying a
string, defined elsewhere in my code):

    let lexstack = Lexstack.create filename Scanner.tokenizer in
    let dummy_lexbuf = Lexing.from_string "" in
    try
        Parser.parse (Lexstack.get_token lexstack) dummy_lexbuf
    with
    |   Scanner.Lexical_error s -> raise (E s)
    |   Parsing.Parse_error ->
            let fname, lnum, lpos = Lexstack.current_pos lexstack in
            let errstr = Printf.sprintf
                "\n\nFile '%s' line %d, column %d : current token is '%s'.\n"
                fname lnum lpos (Lexstack.lexeme lexstack) in
            raise (E errstr)

I haven't tested it as thoroughly as I should have, but the
general idea seems to work.

Hopefully this will get hooked up into the Ocaml Weekly News and
then indexed by Google so other people who run into this problem
can find this solution.

Cheers,
Erik
-- 
-----------------------------------------------------------------
Erik de Castro Lopo
-----------------------------------------------------------------
"Copyrighting allows people to benefit from their labours,
but software patents allow the companies with the largest
legal departments to benefit from everyone else's work."
-- Andrew Brown
(http://www.guardian.co.uk/online/comment/story/0,12449,1387575,00.html)
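
P.S. For anyone who wants to play with the include-stack idea without
setting up ocamllex and ocamlyacc, here is a minimal standalone sketch
of just the stack logic. Everything in it is a stand-in and not part of
lexstack.ml: the token type replaces the generated Parser.token, the
hypothetical "files" table of in-memory token lists replaces real
files, and the "next" function replaces the generated lexer function.
On end of an included file the saved parent source is restored, so
reading resumes just after the include directive.

```ocaml
(* Hypothetical token type standing in for the generated Parser.token. *)
type token =
  | TOK_INCLUDE of string
  | TOK_WORD of string
  | TOK_EOF

(* Hypothetical in-memory "files": file name -> tokens it contains. *)
let files = [
  "main", [TOK_WORD "a"; TOK_INCLUDE "inc"; TOK_WORD "d"];
  "inc",  [TOK_WORD "b"; TOK_WORD "c"];
]

(* One open source: its name and the tokens not yet consumed. *)
type source = { name : string; mutable rest : token list }

let open_source name = { name; rest = List.assoc name files }

(* Stand-in for the lexer function: pop the next token or report EOF. *)
let next src =
  match src.rest with
  | [] -> TOK_EOF
  | t :: tl -> src.rest <- tl; t

(* The stack of suspended sources plus the one currently being read. *)
type t = { mutable stack : source list; mutable current : source }

let create name = { stack = []; current = open_source name }

(* Name of the source currently being read, for error messages. *)
let current_file ls = ls.current.name

let rec get_token ls =
  match next ls.current with
  | TOK_INCLUDE name ->
    (* Suspend the current source and descend into the include. *)
    ls.stack <- ls.current :: ls.stack;
    ls.current <- open_source name;
    get_token ls
  | TOK_EOF ->
    (match ls.stack with
     | [] -> TOK_EOF                 (* end of the top-level source *)
     | parent :: tail ->
       (* Resume the suspended parent where it left off. *)
       ls.current <- parent;
       ls.stack <- tail;
       get_token ls)
  | tok -> tok

(* Collect every word seen from a top-level source, includes expanded. *)
let all_words name =
  let ls = create name in
  let rec loop acc =
    match get_token ls with
    | TOK_WORD w -> loop (w :: acc)
    | TOK_EOF -> List.rev acc
    | TOK_INCLUDE _ -> assert false  (* get_token never returns these *)
  in
  loop []

(* Prints a, b, c, d, one per line. *)
let () = List.iter print_endline (all_words "main")
```

The parser never sees TOK_INCLUDE or an intermediate TOK_EOF, which is
the whole point: from its side there is just one continuous token
stream.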