* camlp4 stream parser syntax
@ 2009-03-07 22:38 Joel Reymont
2009-03-07 22:52 ` Joel Reymont
2009-03-07 23:52 ` [Caml-list] " Jon Harrop
0 siblings, 2 replies; 36+ messages in thread
From: Joel Reymont @ 2009-03-07 22:38 UTC (permalink / raw)
To: O'Caml Mailing List
Where can I read up on the syntax of the following in a camlp4 stream
parser?
| [<' INT n >] -> Int n
For example, where are [< ... >] described and why is the ' needed in
between?
Thanks, Joel
---
http://tinyco.de
Mac, C++, OCaml
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: camlp4 stream parser syntax
2009-03-07 22:38 camlp4 stream parser syntax Joel Reymont
@ 2009-03-07 22:52 ` Joel Reymont
2009-03-07 23:21 ` Re : [Caml-list] " Matthieu Wipliez
2009-03-07 23:52 ` [Caml-list] " Jon Harrop
1 sibling, 1 reply; 36+ messages in thread
From: Joel Reymont @ 2009-03-07 22:52 UTC (permalink / raw)
To: O'Caml Mailing List
> Where can I read up on the syntax of the following in a camlp4
> stream parser?
>
> | [<' INT n >] -> Int n
>
> For example, where are [< ... >] described and why is the ' needed
> in between?
To be more precise, I'm using camlp4 to parse a language into a non-
OCaml AST.
I'm trying to figure out the meaning of [<, >], [[ and ]]
My ocamllex lexer is wrapped to make it look like a stream lexer
(below) and I'm returning a tuple of (tok, loc) because I don't see
another way of making token location available to the parser.
Still, I'm not sure how to integrate the reporting of error location into ?? in
something like this:
| [< 'Token.Kwd '('; e=parse_expr; 'Token.Kwd ')' ?? "expected ')'"
>] -> e
Would someone kindly shed light on this?
Thanks in advance, Joel
P.S. The ocamllex wrapper, which returns a (tok, loc) Stream.t:
{
  let from_lexbuf tab lb =
    let next _ =
      let tok = token tab lb in
      let loc = Loc.of_lexbuf lb in
      Some (tok, loc)
    in Stream.from next

  let setup_loc lb loc =
    let start_pos = Loc.start_pos loc in
    lb.lex_abs_pos <- start_pos.pos_cnum;
    lb.lex_curr_p <- start_pos

  let from_string loc tab str =
    let lb = Lexing.from_string str in
    setup_loc lb loc;
    from_lexbuf tab lb
}
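The idea of pairing every token with its location, and of blaming that location when an expected token is missing, can be sketched in plain OCaml. This is only an illustration under stated assumptions: the `loc`, `token`, `expr` and `stream` types and the `expect` helper are hypothetical, and a mutable list stands in for `Stream.t` so that the sketch runs without camlp4. It is not how camlp4 itself implements `??`.

```ocaml
(* Hypothetical location and token types; Camlp4's Loc.t carries more fields. *)
type loc = { line : int; col : int }
type token = LPAREN | RPAREN | INT of int
type expr = Int of int

(* A parse error that carries the location to report. *)
exception Parse_error of loc * string

(* Stand-in for a (token * loc) Stream.t: a mutable list of pending pairs,
   plus the location of the last token consumed. *)
type stream = { mutable rest : (token * loc) list; mutable last : loc }

let peek s = match s.rest with [] -> None | x :: _ -> Some x

let junk s =
  match s.rest with
  | [] -> ()
  | (_, l) :: tl -> s.last <- l; s.rest <- tl

(* Location to blame: the offending token, or the last one consumed
   when the input has run out. *)
let here s = match peek s with Some (_, l) -> l | None -> s.last

(* Consume [tok] or fail with [msg] at the current location. *)
let expect s tok msg =
  match peek s with
  | Some (t, _) when t = tok -> junk s
  | _ -> raise (Parse_error (here s, msg))

let rec parse_expr s =
  match peek s with
  | Some (INT n, _) -> junk s; Int n
  | Some (LPAREN, _) ->
      junk s;
      let e = parse_expr s in
      expect s RPAREN "expected ')'";
      e
  | _ -> raise (Parse_error (here s, "expected an expression"))
```

Because the location travels with the exception, the caller can print it next to the message instead of having to recover it from the lexer afterwards.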
---
http://tinyco.de
Mac, C++, OCaml
* Re : [Caml-list] Re: camlp4 stream parser syntax
2009-03-07 22:52 ` Joel Reymont
@ 2009-03-07 23:21 ` Matthieu Wipliez
2009-03-07 23:42 ` Joel Reymont
2009-03-08 0:40 ` Joel Reymont
0 siblings, 2 replies; 36+ messages in thread
From: Matthieu Wipliez @ 2009-03-07 23:21 UTC (permalink / raw)
To: O'Caml Mailing List
[-- Attachment #1: Type: text/plain, Size: 3737 bytes --]
Hi Joel,
Why are you using stream parsers instead of Camlp4 grammars?
This:
> let rec parse_primary = parser
>
> | [< 'INT n >] -> Int n
> | [< 'FLOAT n >] -> Float n
> | [< 'STRING n >] -> Str n
> | [< 'TRUE >] -> Bool true
> | [< 'FALSE >] -> Bool false
>
> | [< >] -> raise (Stream.Error "unknown token when expecting an expression.")
could be written as:
expression: [
  [ (i, _) = INT -> Int i
  | (s, _) = STRING -> Str s
  ... ]
];
Note that Camlp4 automatically raises an exception if the input cannot be parsed with the given grammar.
If the input is syntactically correct but semantically wrong, and you want to raise an exception that carries the error location during parsing, you can use Loc.raise as follows:
expression: [
  [ e1 = SELF; "/"; e2 = SELF ->
      if e2 = Int 0 then
        Loc.raise _loc (Failure "division by zero")
      else
        BinaryOp (e1, Div, e2) ]
];
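The pattern in the rule above, a semantic check inside an action that raises an exception tagged with a location, can be sketched in plain OCaml without camlp4. Everything here is an assumption made for illustration: `loc`, `Located`, `raise_at`, `mk_div` and `describe` are hypothetical names, with `raise_at` playing the role of Loc.raise and the explicit `loc` argument standing in for the implicit `_loc`.

```ocaml
(* Hypothetical location type; not Camlp4's Loc.t. *)
type loc = { file : string; line : int }
type op = Div
type expr = Int of int | BinaryOp of expr * op * expr

(* An exception wrapped together with the place where it occurred. *)
exception Located of loc * exn

(* Plays the role of Loc.raise: attach a location to an exception. *)
let raise_at loc e = raise (Located (loc, e))

(* Plays the role of the semantic action of the division rule. *)
let mk_div loc e1 e2 =
  if e2 = Int 0 then raise_at loc (Failure "division by zero")
  else BinaryOp (e1, Div, e2)

(* Turn a located failure into a "file:line: message" string. *)
let describe = function
  | Located (l, Failure msg) -> Printf.sprintf "%s:%d: %s" l.file l.line msg
  | e -> Printexc.to_string e
```

A caller would wrap the parse in a `try ... with` and pass the caught exception to `describe`, which keeps error formatting in one place.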
By the way, do you need your own tailor-made lexer? Camlp4 provides one that might satisfy your needs.
Otherwise, you can always define your own lexer (I had to do that for the project I'm working on; see the attached file).
Your parser would then look like this:
(* functor application *)
module Camlp4Loc = Camlp4.Struct.Loc
module Lexer = Cal_lexer.Make(Camlp4Loc)
module Gram = Camlp4.Struct.Grammar.Static.Make(Lexer)
(* exposes EOI and other stuff *)
open Lexer
(* rule definition *)
let rule = Gram.Entry.mk "rule"
(* grammar definition *)
EXTEND Gram
rule: [ [ ... ] ];
END
(* to parse a file *)
Gram.parse rule (Loc.mk file) (Stream.of_channel ch)
This should be compiled with camlp4of.
I hope this helps you with what you'd like to do,
Cheers,
Matthieu
[-- Attachment #2: cal_lexer.mll --]
[-- Type: application/octet-stream, Size: 10564 bytes --]
(*****************************************************************************)
(* Cal2C *)
(* Copyright (c) 2007-2008, IETR/INSA of Rennes. *)
(* All rights reserved. *)
(* *)
(* This software is governed by the CeCILL-B license under French law and *)
(* abiding by the rules of distribution of free software. You can use, *)
(* modify and/ or redistribute the software under the terms of the CeCILL-B *)
(* license as circulated by CEA, CNRS and INRIA at the following URL *)
(* "http://www.cecill.info". *)
(* *)
(* Matthieu WIPLIEZ <Matthieu.Wipliez@insa-rennes.fr *)
(*****************************************************************************)
(* File cal_lexer.mll *)
{
open Printf
open Format
module Make (Loc : Camlp4.Sig.Loc) = struct
module Loc = Loc
type token =
| KEYWORD of string
| SYMBOL of string
| IDENT of string
| INT of int * string
| FLOAT of float * string
| CHAR of char * string
| STRING of string * string
| EOI
module Token = struct
module Loc = Loc
type t = token
let to_string =
function
KEYWORD s -> sprintf "KEYWORD %S" s
| SYMBOL s -> sprintf "SYMBOL %S" s
| IDENT s -> sprintf "IDENT %S" s
| INT (_, s) -> sprintf "INT %s" s
| FLOAT (_, s) -> sprintf "FLOAT %s" s
| CHAR (_, s) -> sprintf "CHAR '%s'" s
| STRING (_, s) -> sprintf "STRING \"%s\"" s
(* here it's not %S since the string is already escaped *)
| EOI -> sprintf "EOI"
let print ppf x = pp_print_string ppf (to_string x)
let match_keyword kwd = function
KEYWORD kwd' when kwd = kwd' -> true
| _ -> false
let extract_string =
function
KEYWORD s
| IDENT s
| INT (_, s)
| FLOAT (_, s)
| CHAR (_, s)
| STRING (_, s) -> s
| tok ->
invalid_arg ("Cannot extract a string from this token: "^
to_string tok)
module Error = struct
type t =
Illegal_token of string
| Keyword_as_label of string
| Illegal_token_pattern of string * string
| Illegal_constructor of string
exception E of t
let print ppf =
function
Illegal_token s ->
fprintf ppf "Illegal token (%s)" s
| Keyword_as_label kwd ->
fprintf ppf "`%s' is a keyword, it cannot be used as label name" kwd
| Illegal_token_pattern (p_con, p_prm) ->
fprintf ppf "Illegal token pattern: %s %S" p_con p_prm
| Illegal_constructor con ->
fprintf ppf "Illegal constructor %S" con
let to_string x =
let b = Buffer.create 50 in
let () = bprintf b "%a" print x in Buffer.contents b
end
module M = Camlp4.ErrorHandler.Register(Error)
module Filter = struct
type token_filter = (t, Loc.t) Camlp4.Sig.stream_filter
type t =
{ is_kwd : string -> bool;
mutable filter : token_filter }
let mk is_kwd =
{ is_kwd = is_kwd;
filter = fun s -> s }
let keyword_conversion tok is_kwd =
match tok with
SYMBOL s | IDENT s when is_kwd s -> KEYWORD s
| _ -> tok
let filter x =
let f tok loc =
let tok' = keyword_conversion tok x.is_kwd in
(tok', loc)
in
let rec filter =
parser
| [< '(tok, loc); s >] -> [< ' f tok loc; filter s >]
| [< >] -> [< >]
in
fun strm -> x.filter (filter strm)
let define_filter x f = x.filter <- f x.filter
let keyword_added _ _ _ = ()
let keyword_removed _ _ = ()
end
end
open Lexing
(* Error report *)
module Error = struct
type t =
| Illegal_character of char
| Illegal_escape of string
| Unterminated_comment
| Unterminated_string
| Unterminated_quotation
| Unterminated_antiquot
| Unterminated_string_in_comment
| Comment_start
| Comment_not_end
| Literal_overflow of string
exception E of t
open Format
let print ppf =
function
| Illegal_character c ->
fprintf ppf "Illegal character (%s)" (Char.escaped c)
| Illegal_escape s ->
fprintf ppf "Illegal backslash escape in string or character (%s)" s
| Unterminated_comment ->
fprintf ppf "Comment not terminated"
| Unterminated_string ->
fprintf ppf "String literal not terminated"
| Unterminated_string_in_comment ->
fprintf ppf "This comment contains an unterminated string literal"
| Unterminated_quotation ->
fprintf ppf "Quotation not terminated"
| Unterminated_antiquot ->
fprintf ppf "Antiquotation not terminated"
| Literal_overflow ty ->
fprintf ppf "Integer literal exceeds the range of representable integers of type %s" ty
| Comment_start ->
fprintf ppf "this is the start of a comment"
| Comment_not_end ->
fprintf ppf "this is not the end of a comment"
let to_string x =
let b = Buffer.create 50 in
let () = bprintf b "%a" print x in Buffer.contents b
end
let module M = Camlp4.ErrorHandler.Register(Error) in ()
open Error
open Cal2c_util
exception Eof
(* String construction *)
let str = ref ""
type context = {
loc : Loc.t;
in_comment : bool;
quotations : bool;
antiquots : bool;
lexbuf : lexbuf;
buffer : Buffer.t
}
(* Update the current location with file name and line number. *)
let update_loc c file line absolute chars =
let lexbuf = c.lexbuf in
let pos = lexbuf.lex_curr_p in
let new_file =
match file with
| None -> pos.pos_fname
| Some s -> s
in
lexbuf.lex_curr_p <- { pos with
pos_fname = new_file;
pos_lnum = if absolute then line else pos.pos_lnum + line;
pos_bol = pos.pos_cnum - chars;
}
(* Matches either \ or $. Why so many backslashes? Because \ has to be escaped*)
(* in strings, so we get \\. \, | and $ also have to be escaped in regexps, *)
(* so we have \\\\ \\| \\$. *)
let re_id = Str.regexp "\\\\\\|\\$"
}
(* Numbers *)
let nonZeroDecimalDigit = ['1'-'9']
let decimalDigit = '0' | nonZeroDecimalDigit
let decimalLiteral = nonZeroDecimalDigit (decimalDigit)*
let hexadecimalDigit = decimalDigit | ['a'-'f'] | ['A'-'F']
let hexadecimalLiteral = '0' ('x'|'X') hexadecimalDigit (hexadecimalDigit)*
let octalDigit = ['0'-'7']
let octalLiteral = '0' (octalDigit)*
let integer = decimalLiteral | hexadecimalLiteral | octalLiteral
let exponent = ('e'|'E') ('+'|'-')? decimalDigit+
let real = decimalDigit+ '.' (decimalDigit)* exponent?
| '.' decimalDigit+ exponent?
| decimalDigit+ exponent
(* Identifiers *)
let char = ['a'-'z' 'A'-'Z']
let any_identifier = (char | '_' | decimalDigit | '$')+
let other_identifier =
(char | '_') (char | '_' | decimalDigit | '$')*
| '$' (char | '_' | decimalDigit | '$')+
let identifier = '\\' any_identifier '\\' | other_identifier
let newline = ('\010' | '\013' | "\013\010")
(* Token rule *)
rule token c = parse
| [' ' '\t'] {token c lexbuf}
| newline { update_loc c None 1 false 0; token c lexbuf }
| "^" { SYMBOL "^" }
| "->" { SYMBOL "->" }
| ':' { SYMBOL ":" }
| ":=" { SYMBOL ":=" }
| ',' { SYMBOL "," }
| "!=" { SYMBOL "!=" }
| '/' { SYMBOL "/" }
| '.' { SYMBOL "." }
| ".." { SYMBOL ".." }
| "::" { SYMBOL "::" }
| "-->" { SYMBOL "-->" }
| "==>" { SYMBOL "==>" }
| '=' { SYMBOL "=" }
| ">=" { SYMBOL ">=" }
| '>' { SYMBOL ">" }
| '{' { SYMBOL "{" }
| '[' { SYMBOL "[" }
| "<=" { SYMBOL "<=" }
| '<' { SYMBOL "<" }
| '(' { SYMBOL "(" }
| '-' { SYMBOL "-" }
| '+' { SYMBOL "+" }
| '}' { SYMBOL "}" }
| ']' { SYMBOL "]" }
| ')' { SYMBOL ")" }
| ';' { SYMBOL ";" }
| '#' { SYMBOL "#" }
| '*' { SYMBOL "*" }
| integer as lxm { INT (int_of_string lxm, lxm) }
| real as lxm { FLOAT (float_of_string lxm, lxm) }
| identifier as ident {
let ident = Str.global_replace re_id "_" ident in
IDENT ident }
| '"' { let str = string c lexbuf in STRING (str, str) }
| "//" { single_line_comment c lexbuf }
| "/*" { multi_line_comment c lexbuf }
| eof { EOI }
and string ctx = parse
| "\\\"" { str := !str ^ "\\\""; string ctx lexbuf }
| '"' { let s = !str in str := ""; s }
| _ as c { str := !str ^ (String.make 1 c); string ctx lexbuf }
and single_line_comment c = parse
| newline { update_loc c None 1 false 0; token c lexbuf }
| _ { single_line_comment c lexbuf }
and multi_line_comment c = parse
| "*/" { token c lexbuf }
| newline { update_loc c None 1 false 0; multi_line_comment c lexbuf }
| _ { multi_line_comment c lexbuf }
{
let default_context lb =
{ loc = Loc.ghost ;
in_comment = false ;
quotations = true ;
antiquots = false ;
lexbuf = lb ;
buffer = Buffer.create 256 }
let update_loc c = { (c) with loc = Loc.of_lexbuf c.lexbuf }
let with_curr_loc f c = f (update_loc c) c.lexbuf
let lexing_store s buff max =
let rec self n s =
if n >= max then n
else
match Stream.peek s with
| Some x ->
Stream.junk s;
buff.[n] <- x;
succ n
| _ -> n
in
self 0 s
let from_context c =
let next _ =
let tok = with_curr_loc token c in
let loc = Loc.of_lexbuf c.lexbuf in
Some ((tok, loc))
in Stream.from next
let from_lexbuf ?(quotations = true) lb =
let c = { (default_context lb) with
loc = Loc.of_lexbuf lb;
antiquots = !Camlp4_config.antiquotations;
quotations = quotations }
in from_context c
let setup_loc lb loc =
let start_pos = Loc.start_pos loc in
lb.lex_abs_pos <- start_pos.pos_cnum;
lb.lex_curr_p <- start_pos
let from_string ?quotations loc str =
let lb = Lexing.from_string str in
setup_loc lb loc;
from_lexbuf ?quotations lb
let from_stream ?quotations loc strm =
let lb = Lexing.from_function (lexing_store strm) in
setup_loc lb loc;
from_lexbuf ?quotations lb
let mk () loc strm =
from_stream ~quotations:!Camlp4_config.quotations loc strm
end
}
* Re: Re : [Caml-list] Re: camlp4 stream parser syntax
2009-03-07 23:21 ` Re : [Caml-list] " Matthieu Wipliez
@ 2009-03-07 23:42 ` Joel Reymont
2009-03-08 0:40 ` Joel Reymont
1 sibling, 0 replies; 36+ messages in thread
From: Joel Reymont @ 2009-03-07 23:42 UTC (permalink / raw)
To: Matthieu Wipliez; +Cc: O'Caml Mailing List
On Mar 7, 2009, at 11:21 PM, Matthieu Wipliez wrote:
> why are you using stream parsers instead of Camlp4 grammars ?
Because I don't know any better? I'm just starting out, really.
I have a parser that I wrote using ocamlyacc and menhir. I finally
went with dypgen and didn't touch the code for a few months. I then
tried to simplify the grammar on account of a later type checking pass
and realized that I cannot troubleshoot it.
I think I can make do with a camlp4 parser and it will vastly simplify
debugging.
> This:
> ...
> could be written as:
> expression: [
> [ (i, _) = INT -> Int i
> | (s, _) = STRING -> Str s
> ... ]
> ];
Doesn't your example assume that I'm using the camlp4 lexer?
> expression: [
> [ e1 = SELF; "/"; e2 = SELF ->
> if e2 = Int 0 then
> Loc.raise _loc (Failure "division by zero")
> else
> BinaryOp (e1, Div, e2) ]
> ];
Where does SELF above come from?
Can I use a token instead of "/", since my lexer returns SLASH whenever it
finds "/"?
> By the way, do you need you own tailor-made lexer? Camlp4 provides
> one that might satisfy your needs.
It has been said that it's not extensible so I wrote my own lexer
using ocamllex and wrapped it to return (tok, loc) Stream.t.
> Otherwise, you can always define your own lexer (I had to do that
> for the project I'm working on, see file attached).
Thanks, I'll study it.
> Your parser would then look like
>
> (* functor application *)
> module Camlp4Loc = Camlp4.Struct.Loc
> module Lexer = Cal_lexer.Make(Camlp4Loc)
> module Gram = Camlp4.Struct.Grammar.Static.Make(Lexer)
Is this extending the OCaml grammar or starting with an "empty" one?
> (* rule definition *)
> let rule = Gram.Entry.mk "rule"
Is this the "start" rule of the parser?
Thanks, Joel
---
http://tinyco.de
Mac, C++, OCaml
* Re: [Caml-list] camlp4 stream parser syntax
2009-03-07 22:38 camlp4 stream parser syntax Joel Reymont
2009-03-07 22:52 ` Joel Reymont
@ 2009-03-07 23:52 ` Jon Harrop
2009-03-07 23:53 ` Joel Reymont
1 sibling, 1 reply; 36+ messages in thread
From: Jon Harrop @ 2009-03-07 23:52 UTC (permalink / raw)
To: caml-list
On Saturday 07 March 2009 22:38:14 Joel Reymont wrote:
> Where can I read up on the syntax of the following in a camlp4 stream
> parser?
>
> | [<' INT n >] -> Int n
>
> For example, where are [< ... >] described and why is the ' needed in
> between?
The grammar is described formally here:
http://caml.inria.fr/pub/docs/manual-camlp4/manual003.html
You may find one of my free articles on parsing to be of interest because it
covers the stream parser camlp4 extension:
http://www.ffconsultancy.com/ocaml/benefits/parsing.html
There is also a slightly bigger parser here:
http://www.ffconsultancy.com/ocaml/benefits/interpreter.html
The [< .. >] denote a stream when matching over one using the "parser" keyword
and the tick ' denotes a kind of literal to identify a single token in the
stream. So:
| [< 'Kwd "if"; p=parse_expr; 'Kwd "then"; t=parse_expr;
'Kwd "else"; f=parse_expr >] ->
uses ' to parse three individual keywords but also requests that parts of the
stream are parsed using the parse_expr function and each result is named
accordingly.
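What such a clause roughly does can be sketched in plain OCaml, without the camlp4 syntax extension. This is a hand-written approximation, not the code camlp4 generates: the `token`, `expr` and `stream` types are hypothetical, a mutable list stands in for `Stream.t`, and `No_match` and `Parse_error` play the roles of Stream.Failure and Stream.Error. Each `'tok` component becomes "peek, compare, then junk", and each `x = f` component becomes a call to `f` on the same stream.

```ocaml
(* Hypothetical token and AST types, for illustration only. *)
type token = INT of int | KWD of string
type expr = Int of int | If of expr * expr * expr

(* A minimal stand-in for Stream.t: a mutable list of pending tokens. *)
type stream = { mutable rest : token list }

exception No_match               (* plays the role of Stream.Failure *)
exception Parse_error of string  (* plays the role of Stream.Error *)

let peek s = match s.rest with [] -> None | t :: _ -> Some t
let junk s = match s.rest with [] -> () | _ :: tl -> s.rest <- tl

(* A 'Kwd k component after the first one: a mismatch is a hard error,
   because tokens have already been consumed. *)
let expect_kwd s k =
  match peek s with
  | Some (KWD k') when k = k' -> junk s
  | _ -> raise (Parse_error ("expected keyword " ^ k))

let rec parse_expr s =
  match peek s with
  (* [< 'INT n >] -> Int n *)
  | Some (INT n) -> junk s; Int n
  (* [< 'Kwd "if"; p = parse_expr; 'Kwd "then"; t = parse_expr;
        'Kwd "else"; f = parse_expr >] -> If (p, t, f) *)
  | Some (KWD "if") ->
      junk s;
      let p = sub s in
      expect_kwd s "then";
      let t = sub s in
      expect_kwd s "else";
      let f = sub s in
      If (p, t, f)
  (* No clause matched on the first token: fail without consuming anything. *)
  | _ -> raise No_match

(* A sub-parser called after the first component: a soft failure
   becomes a hard error. *)
and sub s =
  try parse_expr s with No_match -> raise (Parse_error "expected an expression")
```

The distinction between the two exceptions is the important design point: a failure on the first component leaves the stream untouched so another clause can be tried, while a failure later in the clause cannot be undone and is reported as an error.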
--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e
* Re: [Caml-list] camlp4 stream parser syntax
2009-03-07 23:52 ` [Caml-list] " Jon Harrop
@ 2009-03-07 23:53 ` Joel Reymont
2009-03-08 0:12 ` Jon Harrop
0 siblings, 1 reply; 36+ messages in thread
From: Joel Reymont @ 2009-03-07 23:53 UTC (permalink / raw)
To: Jon Harrop; +Cc: caml-list
Jon,
On Mar 7, 2009, at 11:52 PM, Jon Harrop wrote:
> The [< .. >] denote a stream when matching over one using the
> "parser" keyword
> and the tick ' denotes a kind of literal to identify a single token
> in the
> stream. So:
>
> | [< 'Kwd "if"; p=parse_expr; 'Kwd "then"; t=parse_expr;
> 'Kwd "else"; f=parse_expr >] ->
Should I be using camlp4 grammars as Matthieu suggested?
It seems there are far more and better resources on doing this
than the stream parsing approach. This includes your OCaml Journal.
Do I lose anything when going with camlp4 grammars and NOT parsing
into an OCaml AST? Do I gain a lot with grammars over stream parsing?
Thanks, Joel
---
http://tinyco.de
Mac, C++, OCaml
* Re: [Caml-list] camlp4 stream parser syntax
2009-03-07 23:53 ` Joel Reymont
@ 2009-03-08 0:12 ` Jon Harrop
2009-03-08 0:20 ` Re : " Matthieu Wipliez
0 siblings, 1 reply; 36+ messages in thread
From: Jon Harrop @ 2009-03-08 0:12 UTC (permalink / raw)
To: Joel Reymont, caml-list
On Saturday 07 March 2009 23:53:03 you wrote:
> Should I be using camlp4 grammars as Matthieu suggested?
>
> It seems there are far more and better resources on doing this
> than the stream parsing approach. This includes your OCaml Journal.
I would say that there is very little documentation about either approach but
I personally found it much easier to use the stream parsers rather than
camlp4 because they are much simpler and, therefore, do not require so much
documentation. Having said that, I never used ??.
> Do I lose anything when going with camlp4 grammars and NOT parsing
> into an OCaml AST?
No, parsing into other ASTs is really easy with Camlp4.
> Do I gain a lot with grammars over stream parsing?
Swings and roundabouts, IMHO. Camlp4 is higher level, more capable and the
syntax is clearer but the documentation is so poor that I have given up every
time I have tried to use it either because the default lexer was insufficient
or because I could not figure out how to extract the necessary data from the
OCaml grammar.
Matthieu's example looks fantastic though...
--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e
* Re : [Caml-list] camlp4 stream parser syntax
2009-03-08 0:12 ` Jon Harrop
@ 2009-03-08 0:20 ` Matthieu Wipliez
2009-03-08 0:29 ` Jon Harrop
2009-03-08 0:30 ` Re : " Joel Reymont
0 siblings, 2 replies; 36+ messages in thread
From: Matthieu Wipliez @ 2009-03-08 0:20 UTC (permalink / raw)
To: caml-list
[-- Attachment #1: Type: text/plain, Size: 1262 bytes --]
Joel asked me for the parser, so I gave it to him; it may be of use to others too, so here it is.
Apart from the application-specific code, it is a good example of a complete Camlp4 lexer/parser for a language.
Note that the lexer is based on a custom lexer that Pietro Abate ( https://www.cduce.org/~abate/how-add-a-custom-lexer-camlp4 ) derived from the cduce lexer.
Cheers,
Matthieu
[-- Attachment #2: cal_parser.ml --]
[-- Type: application/octet-stream, Size: 27492 bytes --]
(*****************************************************************************)
(* Cal2C *)
(* Copyright (c) 2007-2008, IETR/INSA of Rennes. *)
(* All rights reserved. *)
(* *)
(* This software is governed by the CeCILL-B license under French law and *)
(* abiding by the rules of distribution of free software. You can use, *)
(* modify and/ or redistribute the software under the terms of the CeCILL-B *)
(* license as circulated by CEA, CNRS and INRIA at the following URL *)
(* "http://www.cecill.info". *)
(* *)
(* Matthieu WIPLIEZ <Matthieu.Wipliez@insa-rennes.fr *)
(*****************************************************************************)
open Cal2c_util
open Printf
let time = ref 0.
(* Camlp4 stuff *)
module Camlp4Loc = Camlp4.Struct.Loc
module Lexer = Cal_lexer.Make(Camlp4Loc)
module Gram = Camlp4.Struct.Grammar.Static.Make(Lexer)
(** [convert_loc _loc] returns a [Loc.t] from a [Camlp4.Struct.Loc.t]. *)
let convert_loc _loc =
let (file_name, start_line, start_bol, start_off,
stop_line, stop_bol, stop_off, _) = Camlp4Loc.to_tuple _loc in
{
Loc.file_name = file_name;
Loc.start = {Loc.line = start_line; Loc.bol = start_bol; Loc.off = start_off };
Loc.stop = {Loc.line = stop_line; Loc.bol = stop_bol; Loc.off = stop_off };
}
open Lexer
(*****************************************************************************)
(*****************************************************************************)
(*****************************************************************************)
(** [bop _loc e1 op e2] returns [Calast.ExprBOp (convert_loc _loc, e1, op, e2)] *)
let bop _loc e1 op e2 = Calast.ExprBOp (convert_loc _loc, e1, op, e2)
(** [uop _loc op e] returns [Calast.ExprUOp (convert_loc _loc, op, e)] *)
let uop _loc op e = Calast.ExprUOp (convert_loc _loc, op, e)
(*****************************************************************************)
(*****************************************************************************)
(*****************************************************************************)
(* Type definitions *)
(** Defines different kinds of type attributes. *)
type type_attr =
| ExprAttr of Calast.expr (** A type attribute that references an expression. *)
| TypeAttr of Calast.type_def (** A type attribute that references a type. *)
(** [find_size _loc typeAttrs] attempts to find a [type_attr] named ["size"]
that is an [ExprAttr]. The function returns a [Calast.expr]. *)
let find_size _loc typeAttrs =
let attr =
List.assoc "size" typeAttrs
in
match attr with
| ExprAttr e -> e
| _ ->
Asthelper.failwith (convert_loc _loc) "size must be an expression!"
(** [find_type _loc typeAttrs] attempts to find a [type_attr] named ["type"]
that is a [TypeAttr]. The function returns a [Calast.type_def]. *)
let find_type _loc typeAttrs =
let attr =
List.assoc "type" typeAttrs
in
match attr with
| TypeAttr t -> t
| _ -> Asthelper.failwith (convert_loc _loc) "type must be a type!"
(** [find_size_or_default _loc typeAttrs default] attempts to find a [type_attr]
named ["size"] that is an [ExprAttr]. If not found, the function returns the
default size given as an [int]. *)
let find_size_or_default _loc typeAttrs default =
(* size in bits *)
try
find_size _loc typeAttrs
with Not_found ->
(* no size information found, assuming "default" bits. *)
Calast.ExprInt (convert_loc _loc, default)
(** [type_of_typeDef _loc name typeAttrs] returns a [Calast.type_def] from a
name and type attributes that were parsed. *)
let type_of_typeDef _loc name typeAttrs =
match name with
| "bool" -> Calast.TypeBool
| "int" -> Calast.TypeInt (find_size_or_default _loc typeAttrs 32)
| "list" ->
Asthelper.failwith (convert_loc _loc)
"The type \"list\" is deprecated. Please use \"List\"."
| "List" ->
(* get a type *)
let t =
try
find_type _loc typeAttrs
with Not_found ->
Asthelper.failwith (convert_loc _loc)
"RVC-CAL requires that all lists have a type."
in
(* and a size in number of elements *)
let size =
try
find_size _loc typeAttrs
with Not_found ->
Asthelper.failwith (convert_loc _loc)
"RVC-CAL requires that all lists have a size."
in
Calast.TypeList (t, size)
| "string" ->
Asthelper.failwith (convert_loc _loc)
"The type \"string\" is deprecated. Please use \"String\"."
| "String" -> Calast.TypeStr
| "uint" -> Calast.TypeUint (find_size_or_default _loc typeAttrs 32)
| t ->
let message = "The type \"" ^ t ^ "\" is not known.\n\
Did you want to declare a variable \"" ^ t ^ "\"? \
If that is the case please specify its type." in
Asthelper.failwith (convert_loc _loc) message
(*****************************************************************************)
(*****************************************************************************)
(*****************************************************************************)
(* Actor definitions. *)
(** Defines different kinds of actor declarations. *)
type actor_decl =
| Action of Calast.action (** An action of type [Calast.action]. *)
| FuncDecl of Calast.func (** A function declaration at the actor level. *)
| Initialization of Calast.action (** An initialization action of type [Calast.action]. *)
| PriorityOrder of Calast.tag list list (** An actor declaration of type priority order. *)
| ProcDecl of Calast.proc (** A procedure declaration at the actor level. *)
| VarDecl of Calast.var_info (** A variable declaration at the actor level. *)
let get_something pred map declarations =
let (actions, declarations) = List.partition pred declarations in
let actions = List.map map actions in
(actions, declarations)
(** [get_actions declarations] returns a tuple [(actions, declarations)] where
actions is a list of actions and declarations the remaining declarations. *)
let get_actions declarations =
get_something
(function Action _ -> true | _ -> false)
(function | Action a -> a | _ -> failwith "never happens")
declarations
(** [get_funcs declarations] returns a tuple [(funcs, declarations)] where
funcs is a list of function declarations and [declarations] the
remaining declarations. *)
let get_funcs declarations =
get_something
(function FuncDecl _ -> true | _ -> false)
(function | FuncDecl f -> f | _ -> failwith "never happens")
declarations
(** [get_priorities declarations] returns a tuple [(priorities, declarations)] where
priorities is a list of priorities and declarations the remaining declarations. *)
let get_priorities declarations =
let (priorities, declarations) =
get_something
(function PriorityOrder _ -> true | _ -> false)
(function | PriorityOrder p -> p | _ -> failwith "never happens")
declarations
in
let priorities = List.flatten priorities in
(priorities, declarations)
(** [get_procs declarations] returns a tuple [(procs, declarations)] where
procs is a list of procedure declarations and [declarations] the
remaining declarations. *)
let get_procs declarations =
get_something
(function ProcDecl _ -> true | _ -> false)
(function | ProcDecl p -> p | _ -> failwith "never happens")
declarations
(** [get_initializes declarations] returns a tuple [(initializes, declarations)]
where initializes is a list of initialize and declarations the remaining
declarations. *)
let get_initializes declarations =
get_something
(function Initialization _ -> true | _ -> false)
(function | Initialization i -> i | _ -> failwith "never happens")
declarations
(** [get_vars declarations] returns a tuple [(vars, declarations)] where
vars is a list of local variable declarations and [declarations] the
remaining declarations. *)
let get_vars declarations =
get_something
(function VarDecl _ -> true | _ -> false)
(function | VarDecl v -> v | _ -> failwith "never happens")
declarations
let var assignable global loc name t v =
{ Calast.v_assignable = assignable;
v_global = global;
v_loc = loc;
v_name = name;
v_type = t;
v_value = v }
(*****************************************************************************)
(*****************************************************************************)
(*****************************************************************************)
(* Rule declarations *)
let actor = Gram.Entry.mk "actor"
let actorActionOrInit = Gram.Entry.mk "actorActionOrInit"
let actorDeclarations = Gram.Entry.mk "actorDeclarations"
let actorImport = Gram.Entry.mk "actorImport"
let actorPars = Gram.Entry.mk "actorPars"
let actorPortDecls = Gram.Entry.mk "actorPortDecls"
let action = Gram.Entry.mk "action"
let actionChannelSelector = Gram.Entry.mk "actionChannelSelector"
let actionChannelSelectorNames = Gram.Entry.mk "actionChannelSelectorNames"
let actionDelay = Gram.Entry.mk "actionDelay"
let actionGuards = Gram.Entry.mk "actionGuards"
let actionInputs = Gram.Entry.mk "actionInputs"
let actionOutputs = Gram.Entry.mk "actionOutputs"
let actionRepeat = Gram.Entry.mk "actionRepeat"
let actionStatements = Gram.Entry.mk "actionStatements"
let actionTag = Gram.Entry.mk "actionTag"
let actionTokenNames = Gram.Entry.mk "actionTokenNames"
let expression = Gram.Entry.mk "expression"
let expressionGenerators = Gram.Entry.mk "expressionGenerators"
let expressionGeneratorsOpt = Gram.Entry.mk "expressionGeneratorsOpt"
let expressions = Gram.Entry.mk "expressions"
let ident = Gram.Entry.mk "ident"
let initializationAction = Gram.Entry.mk "initializationAction"
let qualifiedId = Gram.Entry.mk "qualifiedId"
let priorityInequality = Gram.Entry.mk "priorityInequality"
let priorityOrder = Gram.Entry.mk "priorityOrder"
let schedule = Gram.Entry.mk "schedule"
let stateTransition = Gram.Entry.mk "stateTransition"
let stateTransitions = Gram.Entry.mk "stateTransitions"
let statements = Gram.Entry.mk "statements"
let statementForEachIdents = Gram.Entry.mk "statementForEachIdents"
let statementIfElseOpt = Gram.Entry.mk "statementIfElseOpt"
let typeAttrs = Gram.Entry.mk "typeAttrs"
let typeDef = Gram.Entry.mk "typeDef"
let typePars = Gram.Entry.mk "typePars"
let typeParsOpt = Gram.Entry.mk "typeParsOpt"
let varDecl = Gram.Entry.mk "varDecl"
let varDeclFunctionParams = Gram.Entry.mk "varDeclFunctionParams"
let varDeclNoExpr = Gram.Entry.mk "varDeclNoExpr"
let varDecls = Gram.Entry.mk "varDecls"
let varDeclsAndDoOpt = Gram.Entry.mk "varDeclsAndDoOpt"
let varDeclsOpt = Gram.Entry.mk "varDeclsOpt"
(* Grammar definition *)
EXTEND Gram
(***************************************************************************)
(* an action. *)
action: [
[ inputs = actionInputs; "==>"; outputs = actionOutputs;
guards = actionGuards;
OPT actionDelay;
decls = varDeclsOpt;
stmts = actionStatements;
"end" ->
{
Calast.a_guards = guards;
a_inputs = inputs;
a_loc = convert_loc _loc;
a_outputs = outputs;
a_stmts = stmts;
a_tag = []; (* the tag is filled in the actorDeclarations rule. *)
a_vars = decls;
}
]
];
actionChannelSelector: [
[ actionChannelSelectorNames ->
Asthelper.failwith (convert_loc _loc)
"RVC-CAL does not support channel selectors." ]
];
actionChannelSelectorNames: [ [ "at" | "at*" | "any" | "all" ] ];
actionDelay: [ [ "delay"; expression ->
Asthelper.failwith (convert_loc _loc)
"RVC-CAL does not permit the use of delay." ] ];
actionGuards: [ [ "guard"; e = expressions -> e | -> [] ] ];
(* action inputs *)
actionInputs: [
[ l = LIST0 [
"["; tokens = actionTokenNames; "]"; repeat = actionRepeat; OPT actionChannelSelector ->
("", tokens, repeat)
| (_, portName) = ident; ":"; "["; tokens = actionTokenNames; "]"; repeat = actionRepeat; OPT actionChannelSelector ->
(portName, tokens, repeat)
] SEP "," -> l ]
];
(* action outputs *)
actionOutputs: [
[ l = LIST0 [
"["; exprs = expressions; "]"; repeat = actionRepeat; OPT actionChannelSelector ->
("", exprs, repeat)
| (_, portName) = ident; ":"; "["; exprs = expressions; "]"; repeat = actionRepeat; OPT actionChannelSelector ->
(portName, exprs, repeat)
] SEP "," -> l ]
];
actionRepeat: [
[ "repeat"; e = expression -> e
| -> Calast.ExprInt (convert_loc _loc, 1) ]
];
actionStatements: [ [ "do"; s = statements -> s | -> [] ] ];
actionTag: [ [ tag = LIST1 [ (_, id) = ident -> id ] SEP "." -> tag ] ];
actionTokenNames: [
[ tokens = LIST0 [ (loc, id) = ident -> (loc, id) ] SEP "," -> tokens ]
];
(***************************************************************************)
(* a CAL actor. *)
actor: [
[ LIST0 actorImport; "actor"; (_, name) = ident; typeParsOpt;
"("; parameters = actorPars; ")";
inputs = actorPortDecls; "==>"; outputs = actorPortDecls; ":";
declarations1 = actorDeclarations;
fsm = OPT schedule;
declarations2 = actorDeclarations;
"end"; `EOI ->
let declarations = List.append declarations1 declarations2 in
let (actions, declarations) = get_actions declarations in
let (funcs, declarations) = get_funcs declarations in
let (priorities, declarations) = get_priorities declarations in
let (procs, declarations) = get_procs declarations in
let (vars, declarations) = get_vars declarations in
let (_initializes, declarations) = get_initializes declarations in
assert (declarations = []);
{
Calast.ac_actions = actions;
ac_fsm = fsm;
ac_funcs = funcs;
ac_inputs = inputs;
ac_name = name;
ac_outputs = outputs;
ac_parameters = parameters;
ac_priorities = priorities;
ac_procs = procs;
ac_vars = vars;
}
]
];
actorActionOrInit: [
[ "action"; a = action -> Action a
| "initialize"; i = initializationAction -> Initialization i ]
];
(* declarations in the actor body. A few rules are duplicated here because
the grammar is not LL(1). In contrast with CLR, functions and procedures
may only be declared at this level. Cal2C does not support nested function
declarations. *)
actorDeclarations: [
[ l = LIST0 [
"action"; a = action -> Action a
| "function"; (_, n) = ident; "("; p = varDeclFunctionParams; ")";
"-->"; t = typeDef; v = varDeclsOpt; ":"; e = expression; "end" ->
FuncDecl {
Calast.f_decls = v;
f_expr = e;
f_loc = convert_loc _loc;
f_name = n;
f_params = p;
f_return = t;
}
| "procedure"; (_, n) = ident; "("; p = varDeclFunctionParams; ")";
v = varDeclsOpt; "begin"; s = statements; "end" ->
ProcDecl {
Calast.p_decls = v;
p_loc = convert_loc _loc;
p_name = n;
p_params = p;
p_stmts = s
}
| "initialize"; i = initializationAction -> Initialization i
| "priority"; p = priorityOrder -> PriorityOrder p
| (_, tag) = ident; ":"; a = actorActionOrInit ->
(match a with
| Action a -> Action {a with Calast.a_tag = [tag]}
| Initialization a -> Initialization {a with Calast.a_tag = [tag]}
| _ -> failwith "never happens")
| (_, tag) = ident; "."; tags = actionTag; ":"; a = actorActionOrInit ->
(match a with
| Action a -> Action {a with Calast.a_tag = tag :: tags}
| Initialization a -> Initialization {a with Calast.a_tag = tag :: tags}
| _ -> failwith "never happens")
| ident; "["; typePars; "]" ->
Asthelper.failwith (convert_loc _loc) "RVC-CAL does not support type parameters."
| (_, name) = ident; (var_loc, var_name) = ident; ";" ->
let t = type_of_typeDef _loc name [] in
VarDecl (var true true var_loc var_name t None)
| (_, name) = ident; (var_loc, var_name) = ident; "="; e = expression; ";" ->
let t = type_of_typeDef _loc name [] in
VarDecl (var false true var_loc var_name t (Some e))
| (_, name) = ident; (var_loc, var_name) = ident; ":="; e = expression; ";" ->
let t = type_of_typeDef _loc name [] in
VarDecl (var true true var_loc var_name t (Some e))
| (_, name) = ident; "("; attrs = typeAttrs; ")"; (var_loc, var_name) = ident; ";" ->
let t = type_of_typeDef _loc name attrs in
VarDecl (var true true var_loc var_name t None)
| (_, name) = ident; "("; attrs = typeAttrs; ")";
(var_loc, var_name) = ident; "="; e = expression; ";" ->
let t = type_of_typeDef _loc name attrs in
VarDecl (var false true var_loc var_name t (Some e))
| (_, name) = ident; "("; attrs = typeAttrs; ")";
(var_loc, var_name) = ident; ":="; e = expression; ";" ->
let t = type_of_typeDef _loc name attrs in
VarDecl (var true true var_loc var_name t (Some e))
| (_, i) = ident; ";" ->
Asthelper.failwith (convert_loc _loc)
("Missing type for declaration of \"" ^ i ^ "\".")
| (_, i) = ident; "="; expression; ";" ->
Asthelper.failwith (convert_loc _loc)
("Missing type for declaration of \"" ^ i ^ "\".")
| (_, i) = ident; ":="; expression; ";" ->
Asthelper.failwith (convert_loc _loc)
("Missing type for declaration of \"" ^ i ^ "\".")
] -> l ]
];
(* stuff imported by the current actor *)
actorImport: [
[ "import"; "all"; qualifiedId; ";" -> ()
| "import"; qualifiedId; ";" -> () ]
];
(* actor parameters: type, name and optional expression. *)
actorPars: [
[ parameters = LIST0 [
t = typeDef; (_, name) = ident; v = OPT [ "="; e = expression -> e ] ->
var false true (convert_loc _loc) name t v
] SEP "," -> parameters ]
];
(* a port declaration: "multi" or not, type and identifier. *)
actorPortDecls: [
[ l = LIST0 [
OPT "multi"; t = typeDef; (_, name) = ident ->
var false true (convert_loc _loc) name t None
] SEP "," -> l ]
];
(***************************************************************************)
(* expressions. *)
expression: [
"top"
[ "["; e = expressions; g = expressionGeneratorsOpt; "]" ->
Calast.ExprList (convert_loc _loc, e, g)
| "if"; e1 = SELF; "then"; e2 = expression; "else"; e3 = expression; "end" ->
Calast.ExprIf (convert_loc _loc, e1, e2, e3) ]
| "or"
[ e1 = SELF; "or"; e2 = SELF -> bop _loc e1 Calast.BOpOr e2 ]
| "and"
[ e1 = SELF; "and"; e2 = SELF -> bop _loc e1 Calast.BOpAnd e2 ]
| "cmp"
[ e1 = SELF; "="; e2 = SELF -> bop _loc e1 Calast.BOpEQ e2
| e1 = SELF; "!="; e2 = SELF -> bop _loc e1 Calast.BOpNE e2
| e1 = SELF; "<"; e2 = SELF -> bop _loc e1 Calast.BOpLT e2
| e1 = SELF; "<="; e2 = SELF -> bop _loc e1 Calast.BOpLE e2
| e1 = SELF; ">"; e2 = SELF -> bop _loc e1 Calast.BOpGT e2
| e1 = SELF; ">="; e2 = SELF -> bop _loc e1 Calast.BOpGE e2 ]
| "add"
[ e1 = SELF; "+"; e2 = SELF -> bop _loc e1 Calast.BOpPlus e2
| e1 = SELF; "-"; e2 = SELF -> bop _loc e1 Calast.BOpMinus e2 ]
| "mul"
[ e1 = SELF; "div"; e2 = SELF -> bop _loc e1 Calast.BOpDivInt e2
| e1 = SELF; "mod"; e2 = SELF -> bop _loc e1 Calast.BOpMod e2
| e1 = SELF; "*"; e2 = SELF -> bop _loc e1 Calast.BOpTimes e2
| e1 = SELF; "/"; e2 = SELF -> bop _loc e1 Calast.BOpDiv e2 ]
| "exp"
[ e1 = SELF; "^"; e2 = SELF -> bop _loc e1 Calast.BOpExp e2 ]
| "unary"
[ "-"; e = SELF -> uop _loc Calast.UOpMinus e
| "not"; e = SELF -> uop _loc Calast.UOpNot e
| "#"; e = SELF -> uop _loc Calast.UOpNbElts e ]
| "simple"
[ "("; e = SELF; ")" -> e
| "true" -> Calast.ExprBool (convert_loc _loc, true)
| "false" -> Calast.ExprBool (convert_loc _loc, false)
| (i, _) = INT -> Calast.ExprInt (convert_loc _loc, i)
| (s, _) = STRING -> Calast.ExprStr (convert_loc _loc, s)
| (_, v) = ident; "("; el = expressions; ")" ->
Calast.ExprCall (convert_loc _loc, v, el)
| (loc, v) = ident; el = LIST1 [ "["; e = expression; "]" -> e ] ->
Calast.ExprIdx (convert_loc _loc, (loc, v), el)
| (loc, v) = ident -> Calast.ExprVar (loc, v) ]
];
expressionGenerators: [
[ l = LIST1 [
"for"; t = typeDef; (loc, name) = ident; "in"; e = expression ->
let var = var false false loc name t None in
(var, e) ] SEP "," -> l ]
];
expressionGeneratorsOpt: [ [ ":"; g = expressionGenerators -> g | -> [] ] ];
expressions: [ [ l = LIST0 [ e = expression -> e ] SEP "," -> l ] ];
ident: [ [ s = IDENT -> (convert_loc _loc, s) ] ];
(***************************************************************************)
(* initialization action. *)
initializationAction: [
[ "==>"; outputs = actionOutputs;
guards = actionGuards; OPT actionDelay; decls = varDeclsOpt;
stmts = actionStatements;
"end" ->
{
Calast.a_guards = guards;
a_inputs = [];
a_loc = convert_loc _loc;
a_outputs = outputs;
a_stmts = stmts;
a_tag = []; (* the tag is filled in the actorDeclarations rule. *)
a_vars = decls;
}
]
];
(***************************************************************************)
qualifiedId: [ [ qid = LIST1 [ id = ident -> id ] SEP "." -> qid ] ];
(***************************************************************************)
(* schedule and priorities. We only support FSM schedules. *)
priorityInequality: [
[ tag = actionTag; ">"; tags = LIST1 [a = actionTag -> a ] SEP ">" -> tag :: tags ]
];
priorityOrder: [ [ l = LIST0 [ p = priorityInequality; ";" -> p ]; "end" -> l ] ];
schedule: [
[ "schedule"; "fsm"; (_, first_state) = ident; ":";
transitions = stateTransitions; "end" -> (first_state, transitions)
| "schedule"; "regexp" ->
Asthelper.failwith (convert_loc _loc) "RVC-CAL does not support \"regexp\" schedules."
]
];
stateTransition: [
[ (_, from_state) = ident; "("; action = actionTag; ")"; "-->"; (_, to_state) = ident; ";" ->
(from_state, action, to_state) ]
];
stateTransitions: [ [ l = LIST0 [ t = stateTransition -> t ] -> l ] ];
(***************************************************************************)
(* statements: while, for, if... *)
statements: [
[ l = LIST0 [
"begin"; decls = varDeclsAndDoOpt; st = statements; "end" ->
Calast.StmtBlock (convert_loc _loc, decls, st)
| "choose" ->
Asthelper.failwith (convert_loc _loc)
"RVC-CAL does not support the \"choose\" statement."
| "for" ->
Asthelper.failwith (convert_loc _loc)
"RVC-CAL does not support the \"for\" statement, please use \"foreach\" instead."
| "foreach"; var = varDeclNoExpr; "in"; e = expression;
v = varDeclsOpt; "do"; s = statements; "end" ->
Calast.StmtForeach (convert_loc _loc, var, e, v, s)
| "foreach"; typeDef; ident; "in"; expression; ".."; expression ->
Asthelper.failwith (convert_loc _loc)
"RVC-CAL does not support the \"..\" construct, please use \"Integers\" instead."
| "if"; e = expression; "then"; s1 = statements; s2 = statementIfElseOpt; "end" ->
Calast.StmtIf (convert_loc _loc, e, s1, s2)
| "while"; e = expression; decls = varDeclsOpt; "do"; s = statements; "end" ->
Calast.StmtWhile (convert_loc _loc, e, decls, s)
| (loc, v) = ident; "["; el = expressions; "]"; ":="; e = expression; ";" ->
Calast.StmtInstr (convert_loc _loc,
[Calast.InstrAssignArray (convert_loc _loc, (loc, v), el, e)])
| (_, v) = ident; "."; (_, f) = ident; ":="; e = expression; ";" ->
Calast.StmtInstr (convert_loc _loc,
[Calast.InstrAssignField (convert_loc _loc, v, f, e)])
| (loc, v) = ident; ":="; e = expression; ";" ->
Calast.StmtInstr (convert_loc _loc,
[Calast.InstrAssignVar (convert_loc _loc, (loc, v), e)])
| (_, v) = ident; "("; el = expressions; ")"; ";" ->
Calast.StmtInstr (convert_loc _loc,
[Calast.InstrCall (convert_loc _loc, v, el)])
| (_, v) = ident; "."; (_, m) = ident; "("; el = expressions; ")";
LIST0 [ "."; ident; "("; expressions; ")" ]; ";" ->
Calast.StmtInstr (convert_loc _loc,
[Calast.InstrCallMethod (convert_loc _loc, v, m, el)])
] -> l ]
];
statementForEachIdents: [ [ l = LIST1 [ t = typeDef; (loc, name) = ident ->
var false false loc name t None
] -> l ] ];
statementIfElseOpt: [ [ "else"; s = statements -> s | -> [] ] ];
(***************************************************************************)
(* a type attribute, such as "type:" and "size=" *)
typeAttrs: [
[ l = LIST0 [
(_, attr) = ident; ":"; t = typeDef -> (attr, TypeAttr t)
| (_, attr) = ident; "="; e = expression -> (attr, ExprAttr e)
] SEP "," -> l ]
];
(* a type definition: bool, int(size=5), list(type:int, size=10)... *)
typeDef: [
[ (_, name) = ident -> type_of_typeDef _loc name []
| ident; "["; typePars; "]" ->
Asthelper.failwith (convert_loc _loc) "RVC-CAL does not support type parameters."
| (_, name) = ident; "("; attrs = typeAttrs; ")" ->
type_of_typeDef _loc name attrs ]
];
(* type parameters, not supported at this point. *)
typePars: [ [ LIST0 [ IDENT -> () | IDENT; "<"; typeDef -> () ] SEP "," -> () ] ];
typeParsOpt: [
[ "["; typePars; "]" ->
Asthelper.failwith (convert_loc _loc) "RVC-CAL does not support type parameters."
| ]
];
(***************************************************************************)
(* variable declarations. *)
(* we do not support nested declarations of functions nor procedures. *)
varDecl: [
[ t = typeDef; (loc, name) = ident; "="; e = expression ->
var false false loc name t (Some e)
| t = typeDef; (loc, name) = ident; ":="; e = expression ->
var true false loc name t (Some e)
| t = typeDef; (loc, name) = ident -> var true false loc name t None ]
];
(* t = typeDef; (loc, name) = ident -> var false false loc name t None *)
varDeclFunctionParams: [
[ l = LIST0
[ t = typeDef; (loc, name) = ident -> var true false loc name t None
] SEP "," -> l ]
];
varDeclNoExpr: [
[ t = typeDef; (loc, name) = ident -> var false false loc name t None
]
];
varDecls: [ [ l = LIST1 [ v = varDecl -> v] SEP "," -> l ] ];
varDeclsAndDoOpt: [ [ "var"; decls = varDecls; "do" -> decls | -> [] ] ];
varDeclsOpt: [ [ "var"; decls = varDecls -> decls | -> [] ] ];
END
(*****************************************************************************)
(* additional grammar for -D <type> <name> = <value> *)
let arg = Gram.Entry.mk "arg"
(* Grammar definition *)
EXTEND Gram
arg: [
[ (loc, name) = ident; "="; e = expression ->
var false true loc name Calast.TypeUnknown (Some e) ]
];
END
let parse_with_msg f rule loc stream =
try
f rule loc stream
with Camlp4Loc.Exc_located (loc, exn) ->
(match exn with
| Stream.Error err -> fprintf stderr "%s\n%s\n" (Camlp4Loc.to_string loc) err
| _ -> fprintf stderr "%s\n%s\n" (Camlp4Loc.to_string loc) (Printexc.to_string exn));
exit (-1)
(** [parse_actor path] parses the file whose absolute path is given by [path]
and returns a [Calast.actor]. If anything goes wrong, Cal2C exits. *)
let parse_actor file =
let t1 = Sys.time () in
let ch = open_in file in
let actor =
parse_with_msg Gram.parse actor (Loc.mk file) (Stream.of_channel ch)
in
close_in ch;
let t2 = Sys.time () in
time := !time +. t2 -. t1;
actor
(** [parse_arg str] parses the string [str] as a variable declaration,
and returns a [Calast.var_decl]. If anything goes wrong, Cal2C exits. *)
let parse_arg str =
parse_with_msg Gram.parse arg (Loc.mk str) (Stream.of_string str)
let parse_expr str =
parse_with_msg Gram.parse expression (Loc.mk str) (Stream.of_string str)
* Re: [Caml-list] camlp4 stream parser syntax
2009-03-08 0:20 ` Re : " Matthieu Wipliez
@ 2009-03-08 0:29 ` Jon Harrop
2009-03-08 0:30 ` Re : " Joel Reymont
1 sibling, 0 replies; 36+ messages in thread
From: Jon Harrop @ 2009-03-08 0:29 UTC (permalink / raw)
To: caml-list
On Sunday 08 March 2009 00:20:06 Matthieu Wipliez wrote:
> Joel asked me the parser so I gave it to him, but maybe it can be of use
> for others, so here it is. Apart from the code specific to the application,
> it gives a good example of a complete Camlp4 lexer/parser for a language.
>
> Note that for the lexer I started from a custom lexer made by Pietro Abate
> ( https://www.cduce.org/~abate/how-add-a-custom-lexer-camlp4 ) from the
> cduce lexer.
These are really wonderful examples, thank you!
I had no idea Camlp4 had been used to write such non-trivial parsers...
--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e
* Re: Re : [Caml-list] camlp4 stream parser syntax
2009-03-08 0:20 ` Re : " Matthieu Wipliez
2009-03-08 0:29 ` Jon Harrop
@ 2009-03-08 0:30 ` Joel Reymont
2009-03-08 0:37 ` Re : " Matthieu Wipliez
1 sibling, 1 reply; 36+ messages in thread
From: Joel Reymont @ 2009-03-08 0:30 UTC (permalink / raw)
To: Matthieu Wipliez; +Cc: caml-list
On Mar 8, 2009, at 12:20 AM, Matthieu Wipliez wrote:
> Joel asked me the parser so I gave it to him, but maybe it can be of
> use for others, so here it is.
While we are on this subject... How do you troubleshoot camlp4 rules?
With a stream parser you can invoke individual functions since each is
a full-blown parser. Can the same be done with camlp4, e.g. individual
rules invoked?
Can rules be traced to see which ones are being taken?
Thanks, Joel
---
http://tinyco.de
Mac, C++, OCaml
* Re : Re : [Caml-list] camlp4 stream parser syntax
2009-03-08 0:30 ` Re : " Joel Reymont
@ 2009-03-08 0:37 ` Matthieu Wipliez
0 siblings, 0 replies; 36+ messages in thread
From: Matthieu Wipliez @ 2009-03-08 0:37 UTC (permalink / raw)
To: caml-list
> While we are on this subject... How do you troubleshoot camlp4 rules?
Not sure what you mean :(
> With a stream parser you can invoke individual functions since each is a
> full-blown parser. Can the same be done with camlp4, e.g. individual rules
> invoked?
Well when you invoke the parser with Gram.parse, you give it the entry point. So you may parse only a subset of your language if the grammar allows it.
> Can rules be traced to see which ones are being taken?
Erm, I don't really know... You can always printf when a rule is taken, but I'm not aware of a built-in construct that allows you to monitor the rules that are taken.
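The printf approach can be wrapped in a small helper so that every semantic action reports its rule name. What follows is only a sketch in plain OCaml: trace, trace_enabled and trace_count are hypothetical names, not part of Camlp4, and since an action normally runs only once its rule has matched, this would show the rules that succeeded rather than the ones that were attempted.

```ocaml
(* Hypothetical tracing helper, not part of Camlp4. *)
let trace_enabled = ref true

(* Number of rules reported so far; handy when checking the helper. *)
let trace_count = ref 0

(* [trace rule v] reports on stderr that [rule] was taken and returns [v]
   unchanged, so it can wrap the right-hand side of any semantic action. *)
let trace rule v =
  if !trace_enabled then begin
    incr trace_count;
    prerr_endline ("rule taken: " ^ rule)
  end;
  v

(* Stand-in for a semantic action such as
     actionRepeat: [ [ "repeat"; e = expression -> trace "actionRepeat" e ] ];  *)
let () =
  let e = trace "actionRepeat" 42 in
  assert (e = 42)
```

Setting trace_enabled to false silences the output without touching the grammar.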
Cheers,
Matthieu
* Re: Re : [Caml-list] Re: camlp4 stream parser syntax
2009-03-07 23:21 ` Re : [Caml-list] " Matthieu Wipliez
2009-03-07 23:42 ` Joel Reymont
@ 2009-03-08 0:40 ` Joel Reymont
2009-03-08 1:08 ` Re : " Matthieu Wipliez
1 sibling, 1 reply; 36+ messages in thread
From: Joel Reymont @ 2009-03-08 0:40 UTC (permalink / raw)
To: Matthieu Wipliez; +Cc: O'Caml Mailing List
Matthieu,
Is the camlp4 grammar parser case-insensitive?
Will both Delay and delay be accepted in the actionDelay rule?
actionDelay: [ [ "delay"; expression ->
Asthelper.failwith (convert_loc _loc)
"RVC-CAL does not permit the use of delay." ] ];
Also, I noticed that your lexer has a really small token set, i.e.
type token =
| KEYWORD of string
| SYMBOL of string
| IDENT of string
| INT of int * string
| FLOAT of float * string
| CHAR of char * string
| STRING of string * string
| EOI
My custom lexer, on the other hand, has a HUGE token set, e.g.
type token =
| BUY_TO_COVER
| SELL_SHORT
| AT_ENTRY
| RANGE
| YELLOW
| WHITE
| WHILE
| UNTIL
...
This is partly because I have a very large set of keywords.
Do I correctly understand that I do not need all the keywords since I
can match them in the camlp4 grammar as strings like "BuyToCover",
"SellShort", etc.?
Thanks, Joel
---
http://tinyco.de
Mac, C++, OCaml
* Re : Re : [Caml-list] Re: camlp4 stream parser syntax
2009-03-08 0:40 ` Joel Reymont
@ 2009-03-08 1:08 ` Matthieu Wipliez
2009-03-08 8:25 ` Joel Reymont
2009-03-08 9:34 ` Joel Reymont
0 siblings, 2 replies; 36+ messages in thread
From: Matthieu Wipliez @ 2009-03-08 1:08 UTC (permalink / raw)
To: Joel Reymont; +Cc: O'Caml Mailing List
> Matthieu,
>
> Is the camlp4 grammar parser case-insensitive?
>
> Will both Delay and delay be accepted in the actionDelay rule?
>
> actionDelay: [ [ "delay"; expression ->
> Asthelper.failwith (convert_loc _loc)
> "RVC-CAL does not permit the use of delay." ] ];
No, only "delay" is accepted.
> Also, I noticed that your lexer has a really small token set, i.e.
>
> type token =
> | KEYWORD of string
> | SYMBOL of string
> | IDENT of string
> | INT of int * string
> | FLOAT of float * string
> | CHAR of char * string
> | STRING of string * string
> | EOI
>
> My custom lexer, on the other hand, has a HUGE token set, e.g.
>
> type token =
> | BUY_TO_COVER
> | SELL_SHORT
> | AT_ENTRY
> | RANGE
> | YELLOW
> | WHITE
> | WHILE
> | UNTIL
> ...
>
> This is partly because I have a very large set of keywords.
>
> Do I correctly understand that I do not need all the keywords since I can match
> them in the camlp4 grammar as strings like "BuyToCover", "SellShort", etc.?
Yes that's right.
Also, given the state of the Camlp4 documentation, a good source of
information is the Camlp4 source code, especially camlp4/Camlp4Parsers/Camlp4OCamlRevisedParser.ml and
Camlp4OCamlParser.ml
> I had no idea Camlp4 had been used to write such non-trivial parsers...
Actually the aforementioned files show the power of Camlp4 parsing and grammar extension capabilities quite well IMHO.
Cheers,
Matthieu
* Re: Re : Re : [Caml-list] Re: camlp4 stream parser syntax
2009-03-08 1:08 ` Re : " Matthieu Wipliez
@ 2009-03-08 8:25 ` Joel Reymont
2009-03-08 9:37 ` Daniel de Rauglaudre
2009-03-08 11:45 ` Re : Re : " Matthieu Wipliez
2009-03-08 9:34 ` Joel Reymont
1 sibling, 2 replies; 36+ messages in thread
From: Joel Reymont @ 2009-03-08 8:25 UTC (permalink / raw)
To: Matthieu Wipliez; +Cc: O'Caml Mailing List
How can I make camlp4 parsing case-insensitive?
The only approach I can think of so far is to build a really large
set of tokens and use them instead of strings in the parser.
Any flag I can flip or way to do this without a large set of tokens?
Thanks, Joel
On Mar 8, 2009, at 1:08 AM, Matthieu Wipliez wrote:
>
>> Is the camlp4 grammar parser case-insensitive?
>>
>> Will both Delay and delay be accepted in the actionDelay rule?
>>
>> actionDelay: [ [ "delay"; expression ->
>> Asthelper.failwith (convert_loc _loc)
>> "RVC-CAL does not permit the use of delay." ] ];
>
> No, only "delay" is accepted.
---
http://tinyco.de
Mac, C++, OCaml
* Re: Re : Re : [Caml-list] Re: camlp4 stream parser syntax
2009-03-08 1:08 ` Re : " Matthieu Wipliez
2009-03-08 8:25 ` Joel Reymont
@ 2009-03-08 9:34 ` Joel Reymont
1 sibling, 0 replies; 36+ messages in thread
From: Joel Reymont @ 2009-03-08 9:34 UTC (permalink / raw)
To: Matthieu Wipliez; +Cc: O'Caml Mailing List
On Mar 8, 2009, at 1:08 AM, Matthieu Wipliez wrote:
>> actionDelay: [ [ "delay"; expression ->
>> Asthelper.failwith (convert_loc _loc)
>> "RVC-CAL does not permit the use of delay." ] ];
Which of the following tokens does "delay" get checked against?
I'm assuming that camlp4 has to give "delay" to the lexer somehow and
ask the lexer if the next token matches "delay".
How does this happen?
>>
>> type token =
>> | KEYWORD of string
>> | SYMBOL of string
>> | IDENT of string
>> | INT of int * string
>> | FLOAT of float * string
>> | CHAR of char * string
>> | STRING of string * string
>> | EOI
Thanks, Joel
---
http://tinyco.de
Mac, C++, OCaml
* Re: Re : Re : [Caml-list] Re: camlp4 stream parser syntax
2009-03-08 8:25 ` Joel Reymont
@ 2009-03-08 9:37 ` Daniel de Rauglaudre
2009-03-08 9:51 ` Joel Reymont
2009-03-08 11:45 ` Re : Re : " Matthieu Wipliez
1 sibling, 1 reply; 36+ messages in thread
From: Daniel de Rauglaudre @ 2009-03-08 9:37 UTC (permalink / raw)
To: caml-list
Hi
On Sun, Mar 08, 2009 at 08:25:23AM +0000, Joel Reymont wrote:
> How can I make camlp4 parsing case-insensitive?
I think it should work if you do both of the following:
1/ Change your lexer to generate case-insensitive tokens.
2/ Use the field "tok_match" of the interface with the lexer. Redefining
it allows you to match some token pattern with the corresponding token.
See doc (camlp5) in:
http://pauillac.inria.fr/~ddr/camlp5/doc/htmlc/grammars.html#b:The-lexer-record
In the example "default_match", change the test "if con = p_con" into
"if String.lowercase con = p_con".
Don't know if it still works with Camlp4, but you can often use the
Camlp5 documentation even for many Camlp4 features.
--
Daniel de Rauglaudre
http://pauillac.inria.fr/~ddr/
* Re: Re : Re : [Caml-list] Re: camlp4 stream parser syntax
2009-03-08 9:37 ` Daniel de Rauglaudre
@ 2009-03-08 9:51 ` Joel Reymont
2009-03-08 10:27 ` Daniel de Rauglaudre
0 siblings, 1 reply; 36+ messages in thread
From: Joel Reymont @ 2009-03-08 9:51 UTC (permalink / raw)
To: Daniel de Rauglaudre; +Cc: caml-list
I would prefer to use the #2 approach but I'm using a custom lexer
built by ocamllex.
Where would I plug in String.lowercase con = ... in Matthieu's lexer,
for example?
Thanks, Joel
On Mar 8, 2009, at 9:37 AM, Daniel de Rauglaudre wrote:
> 2/ Use the field "tok_match" of the interface with the lexer.
> Redefining
> it allows you to match some token pattern with the corresponding
> token.
> See doc (camlp5) in:
> http://pauillac.inria.fr/~ddr/camlp5/doc/htmlc/
> grammars.html#b:The-lexer-record
> In the example "default_match", change the test "if con = p_con"
> into
> "if String.lowercase con = p_con".
---
http://tinyco.de
Mac, C++, OCaml
* Re : [Caml-list] Re: camlp4 stream parser syntax
2009-03-08 9:51 ` Joel Reymont
@ 2009-03-08 10:27 ` Daniel de Rauglaudre
2009-03-08 10:35 ` Joel Reymont
0 siblings, 1 reply; 36+ messages in thread
From: Daniel de Rauglaudre @ 2009-03-08 10:27 UTC (permalink / raw)
To: caml-list
Hi,
On Sun, Mar 08, 2009 at 09:51:26AM +0000, Joel Reymont wrote:
> I would prefer to use the #2 approach but I'm using a custom lexer
> built by ocamllex.
Mmm... In the end I am not sure that what I said was correct... I should
test it myself, which is what I generally do before asserting things... :-)
But I was not clear: I said that you had to program *both* items. It
was not an "or" but an "and"...
But... it was false...
Bsakjfvouveoussasj.... I said nothing... I restart...
A change in the lexer should be sufficient.
If you cannot (or if you don't want):
Only changing the "tok_match" record field (2nd point) would not work
for keywords (defined by "just a string" in Camlp* grammars), because
the lexer *must* recognize all combinations of the identifier as
keywords, implying a change, anyway, in the lexer.
On the other hand, if you can accept that these identifiers are not
keywords (i.e. not reserved names), and if there is a token for identifiers,
like "LIDENT" or "UIDENT" in the lexer provided with Camlp* (module Plexer in
Camlp5), you can put them in your grammar as (for example):
LIDENT "delay"
instead of:
"delay"
In this case, a change of the "tok_match" record field should work.
Define the function:
let my_tok_match =
  function
    (p_con, "") ->
      begin function (con, prm) ->
        if con = p_con then prm else raise Stream.Failure
      end
  | (p_con, p_prm) ->
      begin function (con, prm) ->
        (* constructor compared exactly, parameter case-insensitively *)
        if con = p_con && String.lowercase prm = p_prm then prm
        else raise Stream.Failure
      end
;;
Then look for an identifier named "tok_match" in your code, which
should be a record field, and define that "tok_match" record field as
"my_tok_match".
If you don't find it, perhaps it is implicitly used by another Camlp*
library function. In this case, well, more work may be needed.
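To make the intent of such a matching function concrete, here is a self-contained sketch that can be run without Camlp*. It assumes tokens are modelled as (constructor, parameter) string pairs, as in the Camlp5 lexer record, and it lowercases the token's parameter (the identifier text) rather than its constructor name. No_match is a hypothetical stand-in for Stream.Failure, and String.lowercase_ascii is the current name of the older String.lowercase.

```ocaml
(* Stand-in for Stream.Failure, so that the sketch does not depend on the
   Stream module. *)
exception No_match

(* A token is modelled as (constructor, parameter), e.g. ("LIDENT", "Delay").
   A pattern has the same shape; an empty parameter accepts any parameter. *)
let my_tok_match (p_con, p_prm) (con, prm) =
  if con <> p_con then raise No_match
  else if p_prm = "" then prm
  else if String.lowercase_ascii prm = p_prm then prm
  else raise No_match

let () =
  (* The grammar would spell the pattern in lowercase: LIDENT "delay". *)
  assert (my_tok_match ("LIDENT", "delay") ("LIDENT", "Delay") = "Delay");
  assert (my_tok_match ("LIDENT", "") ("LIDENT", "x") = "x")
```

The match returns the parameter as written in the source, so the original spelling stays available to the semantic action.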
--
Daniel de Rauglaudre
http://pauillac.inria.fr/~ddr/
* Re: Re : [Caml-list] Re: camlp4 stream parser syntax
2009-03-08 10:27 ` Daniel de Rauglaudre
@ 2009-03-08 10:35 ` Joel Reymont
2009-03-08 11:07 ` Joel Reymont
0 siblings, 1 reply; 36+ messages in thread
From: Joel Reymont @ 2009-03-08 10:35 UTC (permalink / raw)
To: Daniel de Rauglaudre; +Cc: caml-list
On Mar 8, 2009, at 10:27 AM, Daniel de Rauglaudre wrote:
> Only changing the "tok_match" record field (2nd point) would not work
> for keywords (defined by "just a string" in Camlp* grammars), because
> the lexer *must* recognize all combinations of the identifier as
> keywords, implying a change, anyway, in the lexer.
This is precisely what I'm trying to figure out.
What do I have to change in my _custom_ lexer generated by ocamllex to
recognize keywords defined by just a string in camlp4 grammars. I'm
not using LIDENT, etc. as I have my own set of tokens.
I understand that I need to downcase the keyword (or upcase) but I
don't understand where I need to do this.
The filter module nested in the token module seems like a good
candidate. What functions of the lexer or filter are accessed when a
string keyword (e.g. "delay") is found in the camlp4 grammar?
Thanks, Joel
---
http://tinyco.de
Mac, C++, OCaml
* Re: Re : [Caml-list] Re: camlp4 stream parser syntax
2009-03-08 10:35 ` Joel Reymont
@ 2009-03-08 11:07 ` Joel Reymont
2009-03-08 11:28 ` Daniel de Rauglaudre
0 siblings, 1 reply; 36+ messages in thread
From: Joel Reymont @ 2009-03-08 11:07 UTC (permalink / raw)
To: Daniel de Rauglaudre, caml-list List
On Mar 8, 2009, at 10:35 AM, Joel Reymont wrote:
> The filter module nested in the token module seems like a good
> candidate. What functions of the lexer or filter are accessed when a
> string keyword (e.g. "delay") is found in the camlp4 grammar?
The filter portion of the token module looks like this (more below) ...
module Token = struct
module Loc = Loc
type t = token
...
module Filter = struct
type token_filter = (t, Loc.t) Camlp4.Sig.stream_filter
type t =
{ is_kwd : string -> bool;
mutable filter : token_filter }
let mk is_kwd =
{ is_kwd = is_kwd;
filter = fun s -> s }
let keyword_conversion tok is_kwd =
match tok with
SYMBOL s | IDENT s when is_kwd s -> KEYWORD s
| _ -> tok
...
end
end
The relevant part here is the function is_kwd : (string -> bool)
that's passed to Filter.mk. Within the bowels of Camlp4 a keyword hash
table is set up and used to manage keywords, e.g. gkeywords in gram
below.
The functions using and removing (below) can be used to add and remove
keywords.
module Structure =
struct
  open Sig.Grammar
  module type S =
  sig
    module Loc : Sig.Loc
    module Token : Sig.Token with module Loc = Loc
    module Lexer : Sig.Lexer with module Loc = Loc
                              and module Token = Token
    module Context : Context.S with module Token = Token
    module Action : Sig.Grammar.Action
    type gram =
      { gfilter : Token.Filter.t;
        gkeywords : (string, int ref) Hashtbl.t;
        glexer :
          Loc.t -> char Stream.t -> (Token.t * Loc.t) Stream.t;
        warning_verbose : bool ref; error_verbose : bool ref
      }
    type efun =
      Context.t -> (Token.t * Loc.t) Stream.t -> Action.t
    type token_pattern = ((Token.t -> bool) * string)
    type internal_entry = ...
    type production_rule = ((symbol list) * Action.t)
    ...
    val get_filter : gram -> Token.Filter.t
    val using : gram -> string -> unit
    val removing : gram -> string -> unit
  end
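To make the role of a (string, int ref) Hashtbl.t concrete, here is a minimal sketch of a reference-counted keyword table in plain OCaml. It is a toy model, not Camlp4's code: the global table, the one-argument using/removing, and the choice to forget a keyword once its count reaches zero are all assumptions made for illustration.

```ocaml
(* Toy model of a reference-counted keyword table; not Camlp4's code. *)
let gkeywords : (string, int ref) Hashtbl.t = Hashtbl.create 16

(* Register one more use of [kwd]. *)
let using kwd =
  match Hashtbl.find_opt gkeywords kwd with
  | Some r -> incr r
  | None -> Hashtbl.add gkeywords kwd (ref 1)

(* Drop one use of [kwd]; forget it once nothing refers to it
   (assumed behaviour, chosen for the sketch). *)
let removing kwd =
  match Hashtbl.find_opt gkeywords kwd with
  | Some r ->
      decr r;
      if !r = 0 then Hashtbl.remove gkeywords kwd
  | None -> ()

(* The predicate a token filter would be built from. *)
let is_kwd = Hashtbl.mem gkeywords

let () =
  using "delay";
  using "delay";
  removing "delay";
  assert (is_kwd "delay");        (* still used by one rule *)
  removing "delay";
  assert (not (is_kwd "delay"));  (* last use gone *)
  assert (not (is_kwd "Delay"))   (* lookup is case-sensitive *)
```

The last assertion is the crux of the case-sensitivity question: membership in the table is an exact string match.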
Matthieu is using this bit to parse:
  let parse_arg str =
    parse_with_msg Gram.parse arg (Loc.mk str) (Stream.of_string str)
Should I just invoke Gram.using ... ? I feel that the solution is
staring me in the face here but I still can't recognize it. Help!!!
Thanks, Joel
---
http://tinyco.de
Mac, C++, OCaml
* Re: Re : [Caml-list] Re: camlp4 stream parser syntax
2009-03-08 11:07 ` Joel Reymont
@ 2009-03-08 11:28 ` Daniel de Rauglaudre
0 siblings, 0 replies; 36+ messages in thread
From: Daniel de Rauglaudre @ 2009-03-08 11:28 UTC (permalink / raw)
To: caml-list
Hi,
On Sun, Mar 08, 2009 at 11:07:02AM +0000, Joel Reymont wrote:
> Should I just invoke Gram.using ... ? I feel that the solution is
> staring me in the face here but I still can't recognize it. Help!!!
Well, I am afraid this is probably Camlp4 (not Camlp5). Nicolas Pouillard
could probably help; I don't know the details of the changes made
in Camlp4.
--
Daniel de Rauglaudre
http://pauillac.inria.fr/~ddr/
* Re : Re : Re : [Caml-list] Re: camlp4 stream parser syntax
2009-03-08 8:25 ` Joel Reymont
2009-03-08 9:37 ` Daniel de Rauglaudre
@ 2009-03-08 11:45 ` Matthieu Wipliez
2009-03-08 11:52 ` Joel Reymont
1 sibling, 1 reply; 36+ messages in thread
From: Matthieu Wipliez @ 2009-03-08 11:45 UTC (permalink / raw)
To: O'Caml Mailing List
Since I don't know how to use the filter either, I tried to find another way :-)
In your lexer, do you have something along the lines of the "calc" examples in the official ocamllex documentation, like a hash table that associates strings with tokens?
If so, here is a possible solution: have your hash table associate a lowercase version of the token with what you'd like to use in the grammar:
  "buytocover" => "BuyToCover"
  "sellshort" => "SellShort"
  ...
And you replace the lookup with
  try
    IDENT (Hashtbl.find keyword_table (String.lowercase id))
  with Not_found ->
    IDENT id
This way identifiers that when lower-cased look like "buytocover" ("BuYTOCovEr", "bUytOcOVeR", etc.) are replaced by a single token "BuyToCover", against which you match in the grammar.
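A self-contained sketch of that lookup, for anyone who wants to try it outside a lexer. The token type, the table contents and the helper name ident_token are made up for the example, and String.lowercase_ascii stands in for the String.lowercase used in this thread (the latter is no longer available in recent OCaml releases).

```ocaml
(* Hypothetical token type; a real lexer would have more constructors. *)
type token = IDENT of string

(* Lower-cased spelling -> spelling used in the grammar. *)
let keyword_table : (string, string) Hashtbl.t = Hashtbl.create 16

(* Build the table from the grammar spellings, so that each keyword
   is written down only once. *)
let () =
  List.iter
    (fun kw -> Hashtbl.add keyword_table (String.lowercase_ascii kw) kw)
    [ "BuyToCover"; "SellShort" ]

(* What the lexer's identifier rule would call. *)
let ident_token id =
  match Hashtbl.find_opt keyword_table (String.lowercase_ascii id) with
  | Some canonical -> IDENT canonical
  | None -> IDENT id

let () =
  assert (ident_token "BuYTOCovEr" = IDENT "BuyToCover");
  assert (ident_token "sellSHORT" = IDENT "SellShort");
  assert (ident_token "Price" = IDENT "Price")
```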
Could this satisfy your requirements?
Cheers,
Matthieu
----- Original Message ----
> From: Joel Reymont <joelr1@gmail.com>
> To: Matthieu Wipliez <mwipliez@yahoo.fr>
> Cc: O'Caml Mailing List <caml-list@yquem.inria.fr>
> Sent: Sunday, 8 March 2009, 9:25:23
> Subject: Re: Re : Re : [Caml-list] Re: camlp4 stream parser syntax
>
> How can I make camlp4 parsing case-insensitive?
>
> The only approach I can think of so far is to build a really large
> set of tokens and use them instead of strings in the parser.
>
> Any flag I can flip or way to do this without a large set of tokens?
>
> Thanks, Joel
>
> On Mar 8, 2009, at 1:08 AM, Matthieu Wipliez wrote:
>
> >
> >> Is the camlp4 grammar parser case-insensitive?
> >>
> >> Will both Delay and delay be accepted in the actionDelay rule?
> >>
> >> actionDelay: [ [ "delay"; expression ->
> >> Asthelper.failwith (convert_loc _loc)
> >> "RVC-CAL does not permit the use of delay." ] ];
> >
> > No, only "delay" is accepted.
>
>
>
> ---
> http://tinyco.de
> Mac, C++, OCaml
* Re: Re : Re : Re : [Caml-list] Re: camlp4 stream parser syntax
2009-03-08 11:45 ` Re : Re : " Matthieu Wipliez
@ 2009-03-08 11:52 ` Joel Reymont
2009-03-08 13:33 ` Re : " Matthieu Wipliez
0 siblings, 1 reply; 36+ messages in thread
From: Joel Reymont @ 2009-03-08 11:52 UTC (permalink / raw)
To: Matthieu Wipliez; +Cc: O'Caml Mailing List
On Mar 8, 2009, at 11:45 AM, Matthieu Wipliez wrote:
> In this case, here is a possible solution, you have your hash table
> associate a lowercase version of the token with what you'd like to
> use in the grammar:
> "buytocover" => "BuyToCover"
> "sellshort" => "SellShort"
> ...
I'm doing this already but I don't think it will do the trick with a
camlp4 parser since it goes through is_kwd to find a match when you
use "delay".
I think that the internal keyword hash table in the grammar needs to
be populated with lowercase keywords (by invoking 'using'). I don't
know how to get to the 'using' function yet, though.
Thanks, Joel
---
http://tinyco.de
Mac, C++, OCaml
* Re : Re : Re : Re : [Caml-list] Re: camlp4 stream parser syntax
2009-03-08 11:52 ` Joel Reymont
@ 2009-03-08 13:33 ` Matthieu Wipliez
2009-03-08 13:59 ` Joel Reymont
0 siblings, 1 reply; 36+ messages in thread
From: Matthieu Wipliez @ 2009-03-08 13:33 UTC (permalink / raw)
To: Joel Reymont; +Cc: O'Caml Mailing List
> > In this case, here is a possible solution, you have your hash table associate
> a lowercase version of the token with what you'd like to use in the grammar:
> > "buytocover" => "BuyToCover"
> > "sellshort" => "SellShort"
> > ...
>
>
> I'm doing this already but I don't think it will do the trick with a camlp4
> parser since it goes through is_kwd to find a match when you use "delay".
I've just tested the idea with my lexer, in the rule identifier:
  | identifier as ident {
      if String.lowercase ident = "action" then
        IDENT "ActioN"
      else
        IDENT ident
    }
and replacing the entries in the grammar that match against "action" so that they match against "ActioN".
In the source code, I have
reload: ActIon in8:[i]
shift: acTIon
And Camlp4 parses it correctly. I have a tentative explanation as to why it works below:
> I think that the internal keyword hash table in the grammar needs to be
> populated with lowercase keywords (by invoking 'using'). I don't know how to get
> to the 'using' function yet, though.
I don't think so, here is what happens:
1) you preprocess your grammar with camlp4of. This transforms the EXTEND statements (and a lot of other stuff) to calls to Camlp4 modules/functions.
The grammar parser is in the Camlp4GrammarParser module.
In the rule "symbol", the entry | s = STRING -> matches strings (literal tokens) and produces a TXkwd s.
This is later transformed by make_expr to an expression Camlp4Grammar__.Skeyword s (quotation <:expr< $uid:gm$.Skeyword $str:kwd$ >>)
What this means is that at compile time an entry
my_rule : [ [ "BuyOrSell"; .. ] ]
gets transformed to an AST node
Skeyword "BuyOrSell"
You can see that by running "camlp4of" on the parser. Every rule gets transformed to a call to Gram.extend function, with Gram.Sopt, Gram.Snterm, Gram.Skeyword etc.
2) At runtime, when you start your program, all the Gram.extend calls are executed (because they are top-level). Your parser is kind of configured.
It turns out that extend is just a synonym for Insert.extend
(last line of Static module)
value extend = Insert.extend
This function will insert rules and tokens into Camlp4. The insert_tokens function tells us that whenever a Skeyword is seen, "using gram kwd" is called.
I believe this is the function you're referring to?
This function calls Structure.using, which basically adds a keyword if necessary and increases its reference count. (I think this is to automatically remove unused keywords; remember that Camlp4 can also delete rules, not only insert them.)
So to sum up: when you declare a rule with a token "MyToken", the grammar is configured to recognize a "MyToken" keyword.
Now the lexer produces IDENT (or SYMBOL, for that matter). SYMBOLs are KEYWORDs by default. IDENTs become KEYWORDs if they match the keyword content.
So in our case, the lexer recognizes identifiers. If this identifier equals (case-insensitively speaking) "mytoken", we declare an IDENT "MyToken", which will be later recognized as the "MyToken" keyword (because the is_kwd test is case-sensitive).
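The flow just described can be imitated in a few lines of plain OCaml. This is only a model under stated assumptions: the three-constructor token type, the fixed keyword list and lex_ident are invented for the sketch, and keyword_conversion merely has the same shape as the one quoted earlier in the thread.

```ocaml
type token = IDENT of string | SYMBOL of string | KEYWORD of string

(* Keywords the grammar would have registered when its rules were
   inserted (assumed fixed here). *)
let keywords = [ "MyToken" ]
let is_kwd s = List.mem s keywords

(* Same shape as the keyword_conversion quoted earlier in the thread. *)
let keyword_conversion tok is_kwd =
  match tok with
  | (SYMBOL s | IDENT s) when is_kwd s -> KEYWORD s
  | _ -> tok

(* Lexer side: fold every spelling of the keyword onto the grammar's. *)
let lex_ident id =
  if String.lowercase_ascii id = "mytoken" then IDENT "MyToken"
  else IDENT id

let () =
  assert (keyword_conversion (lex_ident "mYtOkEn") is_kwd = KEYWORD "MyToken");
  assert (keyword_conversion (lex_ident "other") is_kwd = IDENT "other");
  (* Without the lexer's normalisation the exact-match test fails. *)
  assert (keyword_conversion (IDENT "mytoken") is_kwd = IDENT "mytoken")
```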
Cheers,
Matthieu
>
> Thanks, Joel
>
> ---
> http://tinyco.de
> Mac, C++, OCaml
* [Caml-list] Re: camlp4 stream parser syntax
2009-03-08 13:33 ` Re : " Matthieu Wipliez
@ 2009-03-08 13:59 ` Joel Reymont
2009-03-08 14:09 ` Re : " Matthieu Wipliez
0 siblings, 1 reply; 36+ messages in thread
From: Joel Reymont @ 2009-03-08 13:59 UTC (permalink / raw)
To: Matthieu Wipliez; +Cc: O'Caml Mailing List
On Mar 8, 2009, at 1:33 PM, Matthieu Wipliez wrote:
> So to sum up: when you declare a rule with a token "MyToken", the
> grammar is configured to recognize a "MyToken" keyword.
The issue here is that it must be lower case in the camlp4 rules, i.e.
"mytoken".
What if I want to have "MyToken" (camel-case) in the rule and have it
be lowercased when the grammar is extended? I think that requires
extending one of the Camlp4 modules or it won't work.
Also, using is not directly accessible and neither is the keywords
hash table or is_kwd. You _can_ get the filter with get_filter () but
the resulting structure is not mutable so you can't wrap is_kwd to
lowercase the string passed to it.
---
http://tinyco.de
Mac, C++, OCaml
* Re : [Caml-list] Re: camlp4 stream parser syntax
2009-03-08 13:59 ` Joel Reymont
@ 2009-03-08 14:09 ` Matthieu Wipliez
2009-03-08 14:30 ` Joel Reymont
0 siblings, 1 reply; 36+ messages in thread
From: Matthieu Wipliez @ 2009-03-08 14:09 UTC (permalink / raw)
To: O'Caml Mailing List
> > So to sum up: when you declare a rule with a token "MyToken", the grammar is
> configured to recognize a "MyToken" keyword.
>
> The issue here is that it must be lower case in the camlp4 rules, i.e.
> "mytoken".
Why "must"? Do you need it to be lower-case, or does parsing not work if it is not lower-case?
Maybe I did not understand correctly what you want...
I thought you wanted to recognize
BuyOrSell something
buyORsell something
using a single rule, say
buy : [ [ "buyOrSell"; ... ] ]
If that is the case, I think my solution works.
You might even do that:
buy : [ [ "buy_or_sell"; ... ] ]
and at lexing time do
if String.lowercase s = "buyorsell" then
IDENT "buy_or_sell"
else
IDENT s
In this case it is more than a matter of case, but the argument is still valid: I have declared a rule with "buy_or_sell", so the rule will be taken when a "buy_or_sell" keyword is found, and the lexer produces "buy_or_sell" identifiers from anything that matches case-insensitively "BuyOrSell".
Cheers,
Matthieu
* Re: Re : [Caml-list] Re: camlp4 stream parser syntax
2009-03-08 14:09 ` Re : " Matthieu Wipliez
@ 2009-03-08 14:30 ` Joel Reymont
2009-03-08 15:07 ` Re : " Matthieu Wipliez
0 siblings, 1 reply; 36+ messages in thread
From: Joel Reymont @ 2009-03-08 14:30 UTC (permalink / raw)
To: Matthieu Wipliez; +Cc: O'Caml Mailing List
On Mar 8, 2009, at 2:09 PM, Matthieu Wipliez wrote:
> using a single rule, say
> buy : [ [ "buyOrSell"; ... ] ]
Yes, I want camel-case above.
> and at lexing time do
> if String.lowercase s = "buyorsell" then
> IDENT "buy_or_sell"
> else
> IDENT s
And this is the part that I object to. I have quite a number of
keywords and I don't want to have a bunch of if statements or have a
hash table mapping lowercase to camel case. This would mean having to
track the parser (camel case) version in two places: the lexer and the
parser.
What I want is to extend Camlp4.Struct.Grammar.Static with a custom
version of Make that applies String.lowercase before giving the string
to 'using' to be inserted into the keywords table.
Thanks, Joel
---
http://tinyco.de
Mac, C++, OCaml
* Re : Re : [Caml-list] Re: camlp4 stream parser syntax
2009-03-08 14:30 ` Joel Reymont
@ 2009-03-08 15:07 ` Matthieu Wipliez
2009-03-08 15:24 ` Joel Reymont
0 siblings, 1 reply; 36+ messages in thread
From: Matthieu Wipliez @ 2009-03-08 15:07 UTC (permalink / raw)
To: O'Caml Mailing List
> And this is the part that I object to. I have quite a number of keywords and I
> don't want to have a bunch of if statements or have a hash table mapping
> lowercase to camel case. This would mean having to track the parser (camel case)
> version in two places: the lexer and the parser.
Ahhh ok, I (finally) got it!
I believe there is a (partially acceptable) solution, if you are willing to accept having all your keywords in lower-case in the grammar (not in the lexer), i.e. you match against "buyorsell", "sellshort" etc.
Then you can change the functions match_keyword and keyword_conversion as follows:
  let keyword_conversion tok is_kwd =
    match tok with
      SYMBOL s | IDENT s when is_kwd (String.lowercase s) -> KEYWORD s
    | _ -> tok
This will pass lower-cased identifiers to "is_kwd", so "BuyOrSell" becomes a valid keyword.
  let match_keyword kwd = function
      KEYWORD kwd' when kwd = String.lowercase kwd' -> true
    | _ -> false
Here kwd is the keyword from the grammar ("buyorsell") and kwd' is the content of the keyword produced by the lexer ("BuyOrSell"), and they match.
Cheers,
Matthieu
* Re: Re : Re : [Caml-list] Re: camlp4 stream parser syntax
2009-03-08 15:07 ` Re : " Matthieu Wipliez
@ 2009-03-08 15:24 ` Joel Reymont
2009-03-08 15:32 ` Re : " Matthieu Wipliez
0 siblings, 1 reply; 36+ messages in thread
From: Joel Reymont @ 2009-03-08 15:24 UTC (permalink / raw)
To: Matthieu Wipliez; +Cc: O'Caml Mailing List
On Mar 8, 2009, at 3:07 PM, Matthieu Wipliez wrote:
> I believe there is a (partially acceptable) solution, if you are
> willing to accept having all your keywords in lower-case in the
> grammar (not in the lexer), ie you match against "buyorsell",
> "sellshort" etc.
Nope, I want camel case! :D I think a functor or something like that
is called for here. There must be a way to include Structure into a
module to redefine 'using', without having to duplicate
Camlp4.Struct.Grammar.Static.Make!
The problem is that Static includes Structure.
I haven't figured out a solution yet.
I already downcase the idents in the lexer, what I want is to use
camel case in the camlp4 parser and have that be stored as lower case
in the internal hash table.
Thanks, Joel
---
http://tinyco.de
Mac, C++, OCaml
* Re : Re : Re : [Caml-list] Re: camlp4 stream parser syntax
2009-03-08 15:24 ` Joel Reymont
@ 2009-03-08 15:32 ` Matthieu Wipliez
2009-03-08 15:39 ` Joel Reymont
2009-03-08 15:46 ` Joel Reymont
0 siblings, 2 replies; 36+ messages in thread
From: Matthieu Wipliez @ 2009-03-08 15:32 UTC (permalink / raw)
To: Joel Reymont; +Cc: O'Caml Mailing List
> > I believe there is a (partially acceptable) solution, if you are willing to
> accept having all your keywords in lower-case in the grammar (not in the lexer),
> ie you match against "buyorsell", "sellshort" etc.
>
> Nope, I want camel case! :D
lol ok :-)
> I think a functor or something like that is called
> for here. There must be a way to include Structure into a module to redefine
> 'using', without having to duplicate Camlp4.Struct.Grammar.Static.Make!
>
> The problem is that Static includes Structure.
I'd say duplicate Static, and redefine "using". Seems like the simplest solution to me, certainly not the cleanest though (but is there an alternative?).
Cheers,
Matthieu
* [Caml-list] Re: camlp4 stream parser syntax
2009-03-08 15:32 ` Re : " Matthieu Wipliez
@ 2009-03-08 15:39 ` Joel Reymont
2009-03-08 15:46 ` Joel Reymont
1 sibling, 0 replies; 36+ messages in thread
From: Joel Reymont @ 2009-03-08 15:39 UTC (permalink / raw)
To: Matthieu Wipliez; +Cc: O'Caml Mailing List
On Mar 8, 2009, at 3:32 PM, Matthieu Wipliez wrote:
> I'd say duplicate Static, and redefine "using". Seems like the
> simplest solution to me, certainly not the cleanest though (but is
> there an alternative?).
Now we are talking!
This is Static.ml:
module Make (Lexer : Sig.Lexer)
: Sig.Grammar.Static with module Loc = Lexer.Loc
and module Token = Lexer.Token
= struct
module Structure = Structure.Make Lexer;
module Delete = Delete.Make Structure;
module Insert = Insert.Make Structure;
module Fold = Fold.Make Structure;
include Structure;
...
value get_filter () = gram.gfilter;
...
value extend = Insert.extend;
end;
I read the documentation for 'include' but couldn't quite grasp
whether the included interface is exported from the module that does
the including. Given that 'get_filter' is available but 'using' is not,
I reckon the answer is NO.
What if Static1 included Static after making it, then included
Structure again and defined its own using in terms of the one provided
by Structure?
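On the 'include' question, a small experiment in plain OCaml may help. As far as the language goes, 'include' does re-export everything it pulls in; what hides a value is a signature constraint on the enclosing module. Presumably that is why 'using' is not visible through Static, which is sealed with Sig.Grammar.Static. The modules below are stand-ins invented for the sketch and have nothing to do with the real Camlp4 ones beyond their names.

```ocaml
(* Stand-in for the real Structure module. *)
module Structure = struct
  let using kwd = "using " ^ kwd
  let get_filter () = "filter"
end

(* Stand-in for Sig.Grammar.Static: mentions get_filter, not using. *)
module type STATIC = sig
  val get_filter : unit -> string
end

(* Sealed with the signature: using is included but hidden. *)
module Static : STATIC = struct
  include Structure
end

(* Unsealed: everything that was included is visible. *)
module Open_static = struct
  include Structure
end

(* Redefining a value after the include shadows the included one. *)
module Static1 = struct
  include Structure
  let using kwd = Structure.using (String.lowercase_ascii kwd)
end

let () =
  assert (Static.get_filter () = "filter");
  (* Static.using "Delay" would be rejected by the type checker. *)
  assert (Open_static.using "Delay" = "using Delay");
  assert (Static1.using "Delay" = "using delay")
```

Note that shadowing only affects code that refers to the new module; anything that was already built against the original Structure keeps calling the original using.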
---
http://tinyco.de
Mac, C++, OCaml
* [Caml-list] Re: camlp4 stream parser syntax
2009-03-08 15:32 ` Re : " Matthieu Wipliez
2009-03-08 15:39 ` Joel Reymont
@ 2009-03-08 15:46 ` Joel Reymont
2009-03-08 15:55 ` Re : " Matthieu Wipliez
1 sibling, 1 reply; 36+ messages in thread
From: Joel Reymont @ 2009-03-08 15:46 UTC (permalink / raw)
To: Matthieu Wipliez; +Cc: O'Caml Mailing List
On Mar 8, 2009, at 3:32 PM, Matthieu Wipliez wrote:
> I'd say duplicate Static, and redefine "using". Seems like the
> simplest solution to me, certainly not the cleanest though (but is
> there an alternative?).
I don't think this will work elegantly.
Static first makes a Structure (is make the right term?) and then
makes a bunch of other modules using it. A custom Structure will be
needed to downcase the keywords before inserting them into the hash
table, so Static will need to be duplicated as well.
I'm learning modules, functors, etc. Perhaps someone more experienced
in this and camlp4 can weigh in.
Thanks, Joel
-- Static.ml ---
module Make (Lexer : Sig.Lexer)
: Sig.Grammar.Static with module Loc = Lexer.Loc
and module Token = Lexer.Token
= struct
module Structure = Structure.Make Lexer;
module Delete = Delete.Make Structure;
module Insert = Insert.Make Structure;
module Fold = Fold.Make Structure;
include Structure;
---
http://tinyco.de
Mac, C++, OCaml
* Re : [Caml-list] Re: camlp4 stream parser syntax
2009-03-08 15:46 ` Joel Reymont
@ 2009-03-08 15:55 ` Matthieu Wipliez
2009-03-08 16:58 ` Joel Reymont
0 siblings, 1 reply; 36+ messages in thread
From: Matthieu Wipliez @ 2009-03-08 15:55 UTC (permalink / raw)
To: O'Caml Mailing List
> I don't think this will work elegantly.
>
> Static first makes a Structure (is make the right term?) and then makes a bunch
> of other modules using it. A custom Structure will be needed to downcase the
> keywords before inserting them into the hash table, so Static will need to be
> duplicated as well.
Well I just duplicated Static to Static1 (and added Camlp4.Struct.Grammar where necessary) and replaced:
module Structure = Camlp4.Struct.Grammar.Structure.Make Lexer;
by:
  module Structure = struct
    include Camlp4.Struct.Grammar.Structure.Make Lexer;
    value using { gkeywords = table; gfilter = filter } kwd =
      let kwd = String.lowercase kwd in
      let r = try Hashtbl.find table kwd with
        [ Not_found ->
            let r = ref 0 in do { Hashtbl.add table kwd r; r } ]
      in do { Token.Filter.keyword_added filter kwd (r.val = 0);
              incr r };
  end;
This way, I redefine "using" to my liking, the only modification being the lower-casing on the first line.
Structure is then passed to other functors as usual.
Note that you need to compile Static1 with camlp4r because it is revised syntax (in ocamlbuild _tags this is camlp4r, use_camlp4).
This seems to work (you need the lowercase in match_keyword too, btw): I have "acTIon" and "actiON" in the parser, and it parses "action" in input files.
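The reason the trick works is that the functors are applied to the module that already carries the redefined "using". A stripped-down illustration in ordinary OCaml syntax follows; STRUCTURE, Base, Insert and Lowercased are hypothetical stand-ins, not the Camlp4 modules.

```ocaml
module type STRUCTURE = sig
  val table : (string, int ref) Hashtbl.t
  val using : string -> unit
end

(* Stand-in for Structure.Make Lexer. *)
module Base : STRUCTURE = struct
  let table = Hashtbl.create 16
  let using kwd =
    match Hashtbl.find_opt table kwd with
    | Some r -> incr r
    | None -> Hashtbl.add table kwd (ref 1)
end

(* Stand-in for Insert.Make: registers the keywords of a rule. *)
module Insert (S : STRUCTURE) = struct
  let extend keywords = List.iter S.using keywords
end

(* Include the original, then shadow using with a lower-casing one. *)
module Lowercased : STRUCTURE = struct
  include Base
  let using kwd = Base.using (String.lowercase_ascii kwd)
end

(* The functor receives the shadowed definition. *)
module I = Insert (Lowercased)

let () =
  I.extend [ "acTIon"; "actiON" ];
  assert (Hashtbl.mem Lowercased.table "action");
  assert (not (Hashtbl.mem Lowercased.table "acTIon"));
  assert (!(Hashtbl.find Lowercased.table "action") = 2)
```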
Cheers,
Matthieu
[-- Attachment #2: Static1.ml --]
(****************************************************************************)
(* *)
(* Objective Caml *)
(* *)
(* INRIA Rocquencourt *)
(* *)
(* Copyright 2006 Institut National de Recherche en Informatique et *)
(* en Automatique. All rights reserved. This file is distributed under *)
(* the terms of the GNU Library General Public License, with the special *)
(* exception on linking described in LICENSE at the top of the Objective *)
(* Caml source tree. *)
(* *)
(****************************************************************************)
(* Authors:
* - Daniel de Rauglaudre: initial version
* - Nicolas Pouillard: refactoring
*)
open Camlp4;
value uncurry f (x,y) = f x y;
value flip f x y = f y x;
module Make (Lexer : Sig.Lexer)
: Sig.Grammar.Static with module Loc = Lexer.Loc
and module Token = Lexer.Token
= struct
module Structure = struct
include Camlp4.Struct.Grammar.Structure.Make Lexer;
value using { gkeywords = table; gfilter = filter } kwd =
let kwd = String.lowercase kwd in
let r = try Hashtbl.find table kwd with
[ Not_found ->
let r = ref 0 in do { Hashtbl.add table kwd r; r } ]
in do { Token.Filter.keyword_added filter kwd (r.val = 0);
incr r };
end;
module Delete = Camlp4.Struct.Grammar.Delete.Make Structure;
module Insert = Camlp4.Struct.Grammar.Insert.Make Structure;
module Fold = Camlp4.Struct.Grammar.Fold.Make Structure;
include Structure;
value gram =
let gkeywords = Hashtbl.create 301 in
{
gkeywords = gkeywords;
gfilter = Token.Filter.mk (Hashtbl.mem gkeywords);
glexer = Lexer.mk ();
warning_verbose = ref True; (* FIXME *)
error_verbose = Camlp4_config.verbose
};
module Entry = struct
module E = Camlp4.Struct.Grammar.Entry.Make Structure;
type t 'a = E.t 'a;
value mk = E.mk gram;
value of_parser name strm = E.of_parser gram name strm;
value setup_parser = E.setup_parser;
value name = E.name;
value print = E.print;
value clear = E.clear;
value dump = E.dump;
value obj x = x;
end;
value get_filter () = gram.gfilter;
value lex loc cs = gram.glexer loc cs;
value lex_string loc str = lex loc (Stream.of_string str);
value filter ts = Token.Filter.filter gram.gfilter ts;
value parse_tokens_after_filter entry ts = Entry.E.parse_tokens_after_filter entry ts;
value parse_tokens_before_filter entry ts = parse_tokens_after_filter entry (filter ts);
value parse entry loc cs = parse_tokens_before_filter entry (lex loc cs);
value parse_string entry loc str = parse_tokens_before_filter entry (lex_string loc str);
value delete_rule = Delete.delete_rule;
value srules e rl =
Stree (List.fold_left (flip (uncurry (Insert.insert_tree e))) DeadEnd rl);
value sfold0 = Fold.sfold0;
value sfold1 = Fold.sfold1;
value sfold0sep = Fold.sfold0sep;
(* value sfold1sep = Fold.sfold1sep; *)
value extend = Insert.extend;
end;
* Re: Re : [Caml-list] Re: camlp4 stream parser syntax
2009-03-08 15:55 ` Re : " Matthieu Wipliez
@ 2009-03-08 16:58 ` Joel Reymont
2009-03-08 17:04 ` Re : " Matthieu Wipliez
0 siblings, 1 reply; 36+ messages in thread
From: Joel Reymont @ 2009-03-08 16:58 UTC (permalink / raw)
To: Matthieu Wipliez; +Cc: O'Caml Mailing List
On Mar 8, 2009, at 3:55 PM, Matthieu Wipliez wrote:
> Well I just duplicated Static to Static1 (and added
> Camlp4.Struct.Grammar where necessary) and replaced:
> module Structure = Camlp4.Struct.Grammar.Structure.Make Lexer;
> by:
Something like this you mean? I must be doing something wrong as I
never see my printout from 'using'.
Thanks, Joel
--- Static1.ml ---
open Camlp4;
open Struct;
open Grammar;
value uncurry f (x,y) = f x y;
value flip f x y = f y x;
module Make (Lexer : Sig.Lexer)
: Sig.Grammar.Static with module Loc = Lexer.Loc
and module Token = Lexer.Token
= struct
module Structure = struct
include Camlp4.Struct.Grammar.Structure.Make Lexer;
value using { gkeywords = table; gfilter = filter } kwd =
let _ = print_endline ("using: storing " ^ String.lowercase kwd) in
let kwd = String.lowercase kwd in
let r = try Hashtbl.find table kwd with
[ Not_found ->
let r = ref 0 in do { Hashtbl.add table kwd r; r } ]
in do { Token.Filter.keyword_added filter kwd (r.val = 0);
incr r };
end;
module Delete = Delete.Make Structure;
module Insert = Insert.Make Structure;
module Fold = Fold.Make Structure;
include Structure;
value gram =
let gkeywords = Hashtbl.create 301 in
{
gkeywords = gkeywords;
gfilter = Token.Filter.mk (Hashtbl.mem gkeywords);
glexer = Lexer.mk ();
warning_verbose = ref True; (* FIXME *)
error_verbose = Camlp4_config.verbose
};
module Entry = struct
module E = Entry.Make Structure;
type t 'a = E.t 'a;
value mk = E.mk gram;
value of_parser name strm = E.of_parser gram name strm;
value setup_parser = E.setup_parser;
value name = E.name;
value print = E.print;
value clear = E.clear;
value dump = E.dump;
value obj x = x;
end;
value get_filter () = gram.gfilter;
value lex loc cs = gram.glexer loc cs;
value lex_string loc str = lex loc (Stream.of_string str);
value filter ts = Token.Filter.filter gram.gfilter ts;
value parse_tokens_after_filter entry ts =
  Entry.E.parse_tokens_after_filter entry ts;
value parse_tokens_before_filter entry ts =
  parse_tokens_after_filter entry (filter ts);
value parse entry loc cs =
  parse_tokens_before_filter entry (lex loc cs);
value parse_string entry loc str =
  parse_tokens_before_filter entry (lex_string loc str);
value delete_rule = Delete.delete_rule;
value srules e rl =
  Stree (List.fold_left (flip (uncurry (Insert.insert_tree e)))
    DeadEnd rl);
value sfold0 = Fold.sfold0;
value sfold1 = Fold.sfold1;
value sfold0sep = Fold.sfold0sep;
(* value sfold1sep = Fold.sfold1sep; *)
value extend = Insert.extend;
end;
---
http://tinyco.de
Mac, C++, OCaml
* Re : Re : [Caml-list] Re: camlp4 stream parser syntax
2009-03-08 16:58 ` Joel Reymont
@ 2009-03-08 17:04 ` Matthieu Wipliez
2009-03-08 17:15 ` Joel Reymont
0 siblings, 1 reply; 36+ messages in thread
From: Matthieu Wipliez @ 2009-03-08 17:04 UTC (permalink / raw)
To: O'Caml Mailing List
> > Well I just duplicated Static to Static1 (and added Camlp4.Struct.Grammar
> where necessary) and replaced:
> > module Structure = Camlp4.Struct.Grammar.Structure.Make Lexer;
> > by:
>
> Something like this you mean? I must be doing something wrong as I never see my
> printout from 'using'.
In the parser, did you replace
module Gram = Camlp4.Struct.Grammar.Static.Make(Lexer)
by
module Gram = Static1.Make(Lexer)
Because it works fine for me.
Cheers,
Matthieu
* Re: Re : Re : [Caml-list] Re: camlp4 stream parser syntax
2009-03-08 17:04 ` Re : " Matthieu Wipliez
@ 2009-03-08 17:15 ` Joel Reymont
0 siblings, 0 replies; 36+ messages in thread
From: Joel Reymont @ 2009-03-08 17:15 UTC (permalink / raw)
To: Matthieu Wipliez; +Cc: O'Caml Mailing List
On Mar 8, 2009, at 5:04 PM, Matthieu Wipliez wrote:
> In the parser, did you replace
> module Gram = Camlp4.Struct.Grammar.Static.Make(Lexer)
> by
> module Gram = Static1.Make(Lexer)
I forgot to fix match_keyword. Works otherwise, thanks!
Now, why is match_keyword supplied with the original keyword, e.g.
"Delay" when the lower case version of that is supposed to be inserted
into the hash table?
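One possible reading, sketched as a toy model rather than taken from the Camlp4 sources: the rule itself still carries the string as written in the grammar (the Skeyword node mentioned earlier in the thread), and the patched 'using' only lower-cases the key it stores in the table. If so, the parser would hand the rule's own spelling to match_keyword, which therefore has to compare case-insensitively. All names below are illustrative.

```ocaml
type token = IDENT of string | KEYWORD of string
type symbol = Skeyword of string

let table : (string, unit) Hashtbl.t = Hashtbl.create 16

(* Patched using: only the table key is lower-cased. *)
let using kwd = Hashtbl.replace table (String.lowercase_ascii kwd) ()

(* Inserting a rule registers its keyword but keeps the symbol as is. *)
let insert_rule (Skeyword kwd as sym) = using kwd; sym

let is_kwd = Hashtbl.mem table

let keyword_conversion tok =
  match tok with
  | IDENT s when is_kwd s -> KEYWORD s
  | _ -> tok

(* Compare both sides in lower case, since the rule still says "Delay". *)
let match_keyword kwd = function
  | KEYWORD kwd' -> String.lowercase_ascii kwd = String.lowercase_ascii kwd'
  | _ -> false

let () =
  let (Skeyword in_rule) = insert_rule (Skeyword "Delay") in
  assert (in_rule = "Delay");
  assert (is_kwd "delay" && not (is_kwd "Delay"));
  (* The lexer already lower-cases identifiers. *)
  let tok = keyword_conversion (IDENT "delay") in
  assert (tok = KEYWORD "delay");
  assert (match_keyword in_rule tok)
```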
---
http://tinyco.de
Mac, C++, OCaml
Thread overview: 36+ messages
2009-03-07 22:38 camlp4 stream parser syntax Joel Reymont
2009-03-07 22:52 ` Joel Reymont
2009-03-07 23:21 ` Re : [Caml-list] " Matthieu Wipliez
2009-03-07 23:42 ` Joel Reymont
2009-03-08 0:40 ` Joel Reymont
2009-03-08 1:08 ` Re : " Matthieu Wipliez
2009-03-08 8:25 ` Joel Reymont
2009-03-08 9:37 ` Daniel de Rauglaudre
2009-03-08 9:51 ` Joel Reymont
2009-03-08 10:27 ` Daniel de Rauglaudre
2009-03-08 10:35 ` Joel Reymont
2009-03-08 11:07 ` Joel Reymont
2009-03-08 11:28 ` Daniel de Rauglaudre
2009-03-08 11:45 ` Re : Re : " Matthieu Wipliez
2009-03-08 11:52 ` Joel Reymont
2009-03-08 13:33 ` Re : " Matthieu Wipliez
2009-03-08 13:59 ` Joel Reymont
2009-03-08 14:09 ` Re : " Matthieu Wipliez
2009-03-08 14:30 ` Joel Reymont
2009-03-08 15:07 ` Re : " Matthieu Wipliez
2009-03-08 15:24 ` Joel Reymont
2009-03-08 15:32 ` Re : " Matthieu Wipliez
2009-03-08 15:39 ` Joel Reymont
2009-03-08 15:46 ` Joel Reymont
2009-03-08 15:55 ` Re : " Matthieu Wipliez
2009-03-08 16:58 ` Joel Reymont
2009-03-08 17:04 ` Re : " Matthieu Wipliez
2009-03-08 17:15 ` Joel Reymont
2009-03-08 9:34 ` Joel Reymont
2009-03-07 23:52 ` [Caml-list] " Jon Harrop
2009-03-07 23:53 ` Joel Reymont
2009-03-08 0:12 ` Jon Harrop
2009-03-08 0:20 ` Re : " Matthieu Wipliez
2009-03-08 0:29 ` Jon Harrop
2009-03-08 0:30 ` Re : " Joel Reymont
2009-03-08 0:37 ` Re : " Matthieu Wipliez