From: Pietro Abate <Pietro.Abate@pps.jussieu.fr>
To: caml-list@yquem.inria.fr
Subject: camlp4 and lexers
Date: Thu, 15 May 2008 17:00:33 +0200
Message-ID: <20080515150033.GA31934@uranium.pps.jussieu.fr>
Hi all,
This question was asked a few weeks ago, and again last week. However, I
still don't really get how to proceed. I hope we can boil this down to a
small example that sheds a bit more light on the camlp4 internals.
Say I want to write a small parser for regexps (or an arithmetic
calculator), but I don't want to extend the OCaml grammar to do that. I
just want to create a minimal lexer and a minimal grammar to parse
expressions like (aaa*|b?);c
The parser part is easy (below). The part I don't understand is how to
create a lexer. I had a look at the ocsigen xmlcaml lexer and the camlp4
lexer, but I still haven't found a minimal example I can use without
getting confused.
In particular, the problem below is that I want my lexer to give me back
CHAR tokens (different from the CHAR of char * string of camlp4) and not
strings. I could do the same with the camlp4 lexer, but then all my
regexps would have to be written as ('a''a''a' *) etc., which does not
look good.
A while ago I did something similar with the old camlp4 [1] using
plexer, but this is not possible anymore...
A while ago Nicolas suggested copying the Camlp4.PreCast module and the
lexer module and customizing them. I think it should be possible to just
use Struct.Grammar.Static.Make with a new lexer instead... but, as I
said, I'm not able to write a very minimal lexer for this example...
Maybe I'm confused about this.
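
To make the goal concrete, here is a minimal sketch of the tokenizing logic
only, in plain OCaml with no camlp4 dependency. The token type, the names
is_operator, is_blank and tokenize, and the use of integer offsets as
locations are all assumptions made for illustration. This is not the module
that Struct.Grammar.Static.Make expects; wrapping it into the camlp4 lexer
signature is the part that is still missing.

```ocaml
(* Hypothetical token type: CHAR carries a plain char, unlike camlp4's
   CHAR of char * string. *)
type token =
  | KEYWORD of string   (* operators and punctuation: ( ) | ? * + ; . *)
  | CHAR of char        (* any other character is a literal symbol *)
  | EOI                 (* end of input *)

let is_operator = function
  | '(' | ')' | '|' | '?' | '*' | '+' | ';' | '.' -> true
  | _ -> false

let is_blank = function
  | ' ' | '\t' | '\n' | '\r' -> true
  | _ -> false

(* Turn a string into a list of (token, offset) pairs, skipping blanks.
   The offset stands in for a real location value. *)
let tokenize (s : string) : (token * int) list =
  let n = String.length s in
  let rec go i acc =
    if i >= n then List.rev ((EOI, i) :: acc)
    else
      let c = s.[i] in
      if is_blank c then go (i + 1) acc
      else if is_operator c then
        go (i + 1) ((KEYWORD (String.make 1 c), i) :: acc)
      else go (i + 1) ((CHAR c, i) :: acc)
  in
  go 0 []
```

With this, "aaa*" becomes three CHAR tokens followed by KEYWORD "*", so the
regexp can be written without quoting each character.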
I think a minimal example will help more than one person here.
thanks :)
p
-------------------------- This is my parser...
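module RegExGram = Struct.Grammar.Static.Make(RegExpLexer)

let regex = RegExGram.Entry.mk "regex"

EXTEND RegExGram
  GLOBAL: regex;

  (* alternation; the base case falls through to concat so that
     sequences without "|" are accepted as well *)
  regex: [[ e1 = SELF ; "|" ; e2 = concat -> Alt(e1,e2)
          | e1 = concat -> e1 ]
  ];

  (* concatenation, with or without an explicit ";" *)
  concat: [[ e1 = SELF ; ";" ; e2 = seq -> Seq(e1,e2)
           | e1 = SELF ; e2 = seq -> Seq(e1,e2)
           | e1 = seq -> e1 ]
  ];

  (* postfix operators *)
  seq: [[ e1 = simple ; "?" -> Opt e1
        | e1 = simple ; "*" -> Star e1
        | e1 = simple ; "+" -> Plus e1
        | e1 = simple -> e1 ]
  ];

  (* atoms: any character, a parenthesised regexp, or a literal char *)
  simple: [[ "." -> Dot
           | "(" ; e1 = regex ; ")" -> e1
           | `CHAR(s) -> Sym s ]
  ];
END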
----------------------
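
The grammar actions above build values with the constructors Alt, Seq, Opt,
Star, Plus, Dot and Sym, whose type is not shown. A sketch of what that type
could look like follows; the type name regex, the choice of char as the
argument of Sym, and the to_string helper are assumptions made for
illustration.

```ocaml
(* Assumed AST for the grammar actions; Sym holds a plain char. *)
type regex =
  | Alt of regex * regex
  | Seq of regex * regex
  | Opt of regex
  | Star of regex
  | Plus of regex
  | Dot
  | Sym of char

(* Hypothetical printer; Alt and Seq are always parenthesised so the
   output is unambiguous. *)
let rec to_string = function
  | Alt (a, b) -> "(" ^ to_string a ^ "|" ^ to_string b ^ ")"
  | Seq (a, b) -> "(" ^ to_string a ^ ";" ^ to_string b ^ ")"
  | Opt e -> to_string e ^ "?"
  | Star e -> to_string e ^ "*"
  | Plus e -> to_string e ^ "+"
  | Dot -> "."
  | Sym c -> String.make 1 c
```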
[1] http://groups.google.com/group/fa.caml/browse_thread/thread/e26569427cc8879d