Message-Id: <199706012353.SAA06495@kimbark.uchicago.edu>
To: caml-list@inria.fr
Subject: lexing strings
Date: Sun, 01 Jun 1997 18:53:12 -0500
From: Lyn A Headley

Hi,

I pored over the flex/bison manuals without finding an answer, so I hope this question is nontrivial.

I'm having a rough time lexing strings with ocamllex, in an ordinary read-eval-print interpreter whose main grammar rule is:

    expr EOL { $1 }

with one of the 'expr' rules like this:

    | STRING { (O.String $1) }

My lex file has a rule like this:

    | '\'' { slurp lexbuf }

which recursively lexes strings according to the rule 'slurp'. slurp's main regex looks like this:

    [^'\n']*[^'\\']'\''

which should match any sequence of non-newlines until it reaches a ' not preceded by a backslash. slurp returns the token STRING(!build).
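
As a check on what that regex accepts, a literal reading of it can be sketched in plain OCaml. This is only an illustrative sketch, not the ocamllex rule itself: the function name matches_slurp is hypothetical, and it tests whether a whole string has the shape "zero or more characters that are not newlines, then one character that is not a backslash, then a quote".

```ocaml
(* Hypothetical helper, for illustration only: true when the whole
   string s has the shape of the slurp regex read literally, i.e.
   any run of non-newline characters, then one character that is
   not a backslash, then a closing quote. *)
let matches_slurp (s : string) : bool =
  let n = String.length s in
  (* every character before the last two must not be a newline *)
  let rec no_newline i =
    i >= n - 2 || (s.[i] <> '\n' && no_newline (i + 1))
  in
  n >= 2
  && no_newline 0
  && s.[n - 2] <> '\\'   (* second-to-last: anything except a backslash *)
  && s.[n - 1] = '\''    (* last: the closing quote *)

let () =
  List.iter
    (fun s -> Printf.printf "%S -> %b\n" s (matches_slurp s))
    [ "hi'"; "it\\'"; "hi'\n'" ]
```

Read this way, the second character class excludes only the backslash, so a newline is allowed in that one position: the sketch accepts "hi'" and also the five-character input hi' newline ', while it rejects a quote preceded by a backslash. That may be relevant to the behaviour described next.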

My intent, when reading a string, is for the lexer to see the first ', jump into 'slurp', eat up the string and return it as the STRING token, then have the parser read a newline as EOL, thus matching the main grammar rule and printing the result.

This almost works, but the "interpreter" does not respond by printing the expression value until the user types _two_ newlines. That is, typing

    'hi' [newline]

at the prompt is not enough; two newlines are required. Other than that, the expected value is returned.

Does this mean that the first newline is interpreted as part of the STRING? Why would my regex match the newline?

Any help appreciated,

Lyn Headley