From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id QAA21470 for caml-redistribution; Mon, 2 Jun 1997 16:49:30 +0200 (MET DST)
Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id JAA15566 for <caml-list@pauillac.inria.fr>; Mon, 2 Jun 1997 09:36:12 +0200 (MET DST)
Received: from haven.uchicago.edu (root@haven.uchicago.edu [128.135.12.3]) by nez-perce.inria.fr (8.8.5/8.7.3) with ESMTP id BAA29870 for <caml-list@inria.fr>; Mon, 2 Jun 1997 01:54:34 +0200 (MET DST)
Received: from midway.uchicago.edu (root@midway.uchicago.edu [128.135.12.12])
	by haven.uchicago.edu (8.8.5/8.8.5) with ESMTP id SAA18933
	for <caml-list@inria.fr>; Sun, 1 Jun 1997 18:54:30 -0500 (CDT)
Received: from kimbark.uchicago.edu (root@kimbark.uchicago.edu [128.135.12.52]) by midway.uchicago.edu (8.8.5/8.8.3) with ESMTP id SAA18114 for <caml-list@inria.fr>; Sun, 1 Jun 1997 18:53:14 -0500 (CDT)
Received: from kimbark.uchicago.edu (4208@localhost [127.0.0.1]) by kimbark.uchicago.edu (8.8.5/8.8.3) with ESMTP id SAA06495 for <caml-list@inria.fr>; Sun, 1 Jun 1997 18:53:13 -0500 (CDT)
Message-Id: <199706012353.SAA06495@kimbark.uchicago.edu>
To: caml-list@inria.fr
Subject: lexing strings
Date: Sun, 01 Jun 1997 18:53:12 -0500
From: Lyn A Headley <laheadle@midway.uchicago.edu>
Sender: weis

hi,

I pored over the flex/bison manuals without finding an answer,
so I hope this question is nontrivial.

I'm having a rough time lexing strings with ocamllex, just using
the normal read-eval-print interpreter whose main grammar rule is:

expr EOL                { $1 }

with one of the 'expr' rules like this:
| STRING                  { (O.String $1) }


My lex file has a rule like this:

| '\''           { slurp lexbuf }

recursively lexing strings according to rule 'slurp'. slurp's main
regex looks like this:

[^'\n']*[^'\\']'\''

which should match any sequence of non-newlines until it reaches a '
not preceded by a backslash.  slurp returns the token: STRING(!build)).

My intent, when reading a string, is for the lexer to see the first ',
jump into 'slurp,' eat up the string and return it as the STRING token,
then have the parser read a newline and return EOL, thus matching the
main grammar rule and printing the result.  This almost works, but not
until the user types _two_ newlines will the "interpreter" respond
by printing the expression value! i.e., typing

'hi' [newline]

at the prompt is not enough; two newlines are required.  Other than
that, the expected value is returned.  Does this mean that the first
newline is interpreted as part of the STRING?  Why would my regex match
the newline?

any help appreciated,

Lyn Headley