From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id UAA29587; Mon, 28 Apr 2003 20:13:58 +0200 (MET DST) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id UAA29845 for ; Mon, 28 Apr 2003 20:13:54 +0200 (MET DST) Received: from jamlikhet.polytechnique.org (b.mx.m4x.polytechnique.org [129.104.30.35]) by concorde.inria.fr (8.11.1/8.11.1) with ESMTP id h3SIDsH14372 for ; Mon, 28 Apr 2003 20:13:54 +0200 (MET DST) Received: from alan-schm1p.inria.fr (DHCP12-12.CIS.UPENN.EDU [158.130.13.42]) by ssl.polytechnique.org (Postfix) with ESMTP id 496EFB41E for ; Mon, 28 Apr 2003 20:13:53 +0200 (CEST) Received: by alan-schm1p.inria.fr (Postfix, from userid 501) id 6633B4044; Mon, 28 Apr 2003 14:13:48 -0400 (EDT) Date: Mon, 28 Apr 2003 14:13:48 -0400 From: Alan Schmitt To: caml-list@inria.fr Subject: [Caml-list] a few lexing questions Message-ID: <20030428181348.GM9337@alan-schm1p> Mail-Followup-To: caml-list@inria.fr Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4i X-Editor: Vim http://vim.sf.net/ X-Info: http://pauillac.inria.fr/~aschmitt/ X-Operating-System: Linux/2.4.21pre4-1mdk (i686) X-Uptime: 14:04:41 up 15:21, 1 user, load average: 0.02, 0.03, 0.00 X-Spam: no; 0.00; reuse:01 symmetric:01 parser:02 bytes:02 buggy:03 lexbuf:03 lex:04 merged:04 parse:04 merging:05 implement:05 suggestion:06 i'd:06 'x':07 letter:92 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk Hi, I am writing a parser / pretty-printer for the iCalendar file format, and I have a few questions. First of all, there are some times when I need to differentiate lexing according to the first letter of what I am lexing. As I want to reuse the lexing code, I did the following: and alt_lang_xparam = parse | 'A' { lexbuf.Lexing.lex_curr_pos <- Lexing.lexeme_start lexbuf; Altrepparam (altrepparam lexbuf) } | 'X' { lexbuf.Lexing.lex_curr_pos <- Lexing.lexeme_start lexbuf; Xparam (xparam lexbuf) } | 'L' { lexbuf.Lexing.lex_curr_pos <- Lexing.lexeme_start lexbuf; Langparam (languageparam lexbuf) } Is it buggy ? Bad style ? Is there a nicer way to do it ? A second question is about integration this code with other lexing code or streams. An iCalendar file cannot have a line that is longer than 75 bytes, excluding line break. A line may be broken anywhere as long as there is a space at the beginning of the next line, as in: this is a very lo ng line represents "this is a very long line". As this break may occur anywhere (even inside keywords), I assume when writing the lexer these kind of lines have been already merged together. I know how to implement the merging using a temp file, but I'm looking for a nicer solution (like using a stream, or using one lexer to feed the current lexer). Any suggestion ? I'd also like to have some advice on doing the symmetric transformation (breaking long lines when necessary) ... Would streams be a good solution here ? Thanks a lot, Alan Schmitt -- The hacker: someone who figured things out and made something cool happen. ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners