From: Jon Harrop <jon@ffconsultancy.com>
To: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] ocamllex+ocamlyacc and not parsing properly
Date: Mon, 8 Aug 2005 00:58:11 +0100
Message-ID: <200508080058.12357.jon@ffconsultancy.com>
In-Reply-To: <ad8cfe7e050807143962166f9@mail.gmail.com>
On Sunday 07 August 2005 22:39, Jonathan Roewen wrote:
> I'm having some trouble with a lexer+parser I've written to parse IRC
> strings. Just about all strings are parsed correctly, but I'm having a
> few minor issues.
>
> Here are two strings that fail to parse correctly:
> :Sovereign.Wyldryde.org 254 dst 112 :holodeck programs running
> :
> :Sovereign.Wyldryde.org 333 dst #bfos Helio 112025589
I just added "irc_types.ml":
type command = JOIN | PART | MODE | TOPIC | NAMES | LIST | INVITE
| KICK | PRIVMSG | NOTICE | QUIT | PING | Numeric of int
and compiled with:
ocamllex irc_lexer.mll
ocamlyacc irc_parser.mly
ocamlc -c irc_types.ml irc_parser.mli irc_parser.ml irc_lexer.ml
ocamlmktop irc_types.cmo irc_parser.cmo irc_lexer.cmo -o irc.top
ran the custom top-level with "./irc.top" and asked it to lex the first of
your example strings:
# let lexbuf = Lexing.from_string ":Sovereign.Wyldryde.org 254 dst 112 :holodeck programs running";;
val lexbuf : Lexing.lexbuf =
{Lexing.refill_buff = <fun>;
Lexing.lex_buffer =
":Sovereign.Wyldryde.org 254 dst 112 :holodeck programs running";
Lexing.lex_buffer_len = 62; Lexing.lex_abs_pos = 0;
Lexing.lex_start_pos = 0; Lexing.lex_curr_pos = 0;
Lexing.lex_last_pos = 0; Lexing.lex_last_action = 0;
Lexing.lex_eof_reached = true; Lexing.lex_mem = [||];
Lexing.lex_start_p =
{Lexing.pos_fname = ""; Lexing.pos_lnum = 1; Lexing.pos_bol = 0;
Lexing.pos_cnum = 0};
Lexing.lex_curr_p =
{Lexing.pos_fname = ""; Lexing.pos_lnum = 1; Lexing.pos_bol = 0;
Lexing.pos_cnum = 0}}
# Irc_lexer.message lexbuf;;
- : Irc_parser.token = Irc_parser.STRING "Sovereign.Wyldryde.org"
# Irc_lexer.message lexbuf;;
- : Irc_parser.token = Irc_parser.COMMAND (Irc_types.Numeric 254)
# Irc_lexer.message lexbuf;;
- : Irc_parser.token = Irc_parser.STRING "dst"
# Irc_lexer.message lexbuf;;
- : Irc_parser.token = Irc_parser.COMMAND (Irc_types.Numeric 112)
# Irc_lexer.message lexbuf;;
- : Irc_parser.token = Irc_parser.STRING "holodeck programs running"
# Irc_lexer.message lexbuf;;
- : Irc_parser.token = Irc_parser.EOL
So your lexer is emitting the tokens str, com, str, com, str, eol but your
parser looks as though it is expecting str, com, str, str, str, eol.
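Rather than calling the lexer by hand once per token, the whole token stream
can be dumped in one go. This is only a sketch: "drain" is a name I have made
up, and it assumes nothing about the lexer except that it eventually returns
a token you can recognise as the last one.

```ocaml
(* Hypothetical helper: call [next] on [lexbuf] until [is_last] accepts a
   token, and return every token seen, the last one included. *)
let drain next is_last lexbuf =
  let rec go acc =
    let t = next lexbuf in
    if is_last t then List.rev (t :: acc) else go (t :: acc)
  in
  go []

(* In the custom top-level this would be used along the lines of:
     drain Irc_lexer.message (fun t -> t = Irc_parser.EOL) lexbuf *)
```

Comparing the resulting list against the sequence the grammar expects makes
the mismatch easy to spot.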
I'm guessing the error is in the lexer because the grammar in the parser is
very simple. So ":Sovereign.Wyldryde.org" is lexed by "message" into str, " "
then invokes "command", which lexes 254 into com, " " then invokes "param",
which lexes "dst" into str, and "param" should then lex the remaining
parameters into strs. However, that can't be what is happening because the
lexer has clearly gone back into "command" in order to emit
"Irc_types.Numeric 112".
It's just a guess, but have you assumed that each time the parser invokes the
lexer it resumes in the rule it was left in when, in fact, the parser invokes
the "message" rule every time?
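If that is the problem, one way around it is to give the parser a single
token function that remembers which rule comes next. The following is a toy
sketch, not your lexer: the "buf" type stands in for Lexing.lexbuf, the
"token" and "command" types stand in for Irc_parser.token and
Irc_types.command, and the three cases are hand-written guesses at what
"message", "command" and "param" are meant to do.

```ocaml
(* Hypothetical stand-ins for Irc_types.command and Irc_parser.token. *)
type command = Named of string | Numeric of int
type token = STRING of string | COMMAND of command | EOL

(* The rule that should handle the next call. *)
type rule = Message | Command | Param

(* Toy input buffer standing in for Lexing.lexbuf. *)
type buf = { text : string; mutable pos : int }

let skip_spaces b =
  while b.pos < String.length b.text && b.text.[b.pos] = ' ' do
    b.pos <- b.pos + 1
  done

(* Read one space-delimited word starting at the current position. *)
let word b =
  let start = b.pos in
  while b.pos < String.length b.text && b.text.[b.pos] <> ' ' do
    b.pos <- b.pos + 1
  done;
  String.sub b.text start (b.pos - start)

(* Drop the first character of a string (the leading ':'). *)
let tail s = String.sub s 1 (String.length s - 1)

(* Build a token function that remembers which rule comes next. *)
let make_token () =
  let current = ref Message in
  fun b ->
    skip_spaces b;
    if b.pos >= String.length b.text then begin
      current := Message;            (* reset for the next line *)
      EOL
    end else
      match !current with
      | Message when b.text.[b.pos] = ':' ->
        current := Command;
        STRING (tail (word b))       (* ":prefix" *)
      | Message | Command ->
        current := Param;
        let w = word b in
        COMMAND (match int_of_string_opt w with
                 | Some n -> Numeric n
                 | None -> Named w)
      | Param when b.text.[b.pos] = ':' ->
        (* Trailing parameter: everything up to the end of the line. *)
        let n = String.length b.text in
        let s = String.sub b.text (b.pos + 1) (n - b.pos - 1) in
        b.pos <- n;
        STRING s
      | Param -> STRING (word b)

(* Collect the tokens of one line, up to and including EOL. *)
let tokens text =
  let b = { text; pos = 0 } in
  let token = make_token () in
  let rec go acc =
    match token b with
    | EOL -> List.rev (EOL :: acc)
    | t -> go (t :: acc)
  in
  go []
```

Because the current rule lives in a reference that survives between calls,
"112" is handled by the parameter case and comes out as a str rather than a
com. With ocamllex the same dispatch could wrap the generated "message",
"command" and "param" functions, and the wrapper would be passed to the
parser's entry point in place of Irc_lexer.message.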
> BTW: As an aside, if the lexer doesn't cover all the bases, it doesn't
> throw an exception, just screws up my OS (Bounds check error, followed
> by seg-fault).
Any idea what is causing the segfault?
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
Objective CAML for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists