* [Caml-list] Is it possible to extend OCaml lexer rules via Camlp4?
From: Jun Furuse @ 2011-11-02 20:34 UTC
To: caml-list

Hi,

Is it possible to implement an OCaml syntax extension in Camlp4 (i.e. a
pa_* module) that modifies the lexer of the OCaml syntax?

I have tried to override the whole syntax as follows, but it seems to
change nothing:

-----------------------------------------------------------
open Camlp4

module Id : Sig.Id = struct
  let name = "pa_extlex"
  let version = "1.0"
end

(* XLexer reimplements the OCaml lexer with some extra rules *)
module XLexer = Xlexer.Make(PreCast.Token)
module XGram = PreCast.MakeGram(XLexer)

module Make (Syntax : Sig.Camlp4Syntax) = struct
  let _ = prerr_endline "Creating OCaml syntax with lexer extension"
  module M1 = OCamlInitSyntax.Make(PreCast.Ast)(XGram)(PreCast.Quotation)
  module M2 = Camlp4OCamlRevisedParser.Make(M1)
  module M3 = Camlp4OCamlParser.Make(M2)
  include M3
end

let module M = Register.OCamlSyntaxExtension(Id)(Make) in ()
-----------------------------------------------------------

Jun

* Re: [Caml-list] Is it possible to extend OCaml lexer rules via Camlp4?
From: Gabriel Scherer @ 2011-11-02 22:52 UTC
To: Jun Furuse; +Cc: caml-list

> I have tried to override whole the syntax as follows, but it seems
> that it changes nothing...:

Camlp4 is designed around mutable state. The fact that your Make module
produces a new grammar does not make it the current grammar used by the
preprocessor. What needs to happen is that the current state of the
lexer/parser is *mutated* by your Make module (whose evaluation is then
delayed and controlled by Camlp4 itself, thanks to registration with
Register). This is what EXTEND does at the grammar level.

If it suits your needs, you can define your modification as a filter
that post-processes the output of the original lexer. The Token module
exposes a define_filter function to imperatively update the set of such
active stream transformers.

This idea was suggested to me by Jérémie Dimino, and it works well. It
is used for example in pa_comprehension to define "[?" and "?]" as new
OCaml keywords (asking for "["; "?" at the Camlp4 grammar level would
allow spaces in between):
https://github.com/ocaml-batteries-team/batteries-included/blob/master/src/syntax/pa_comprehension/pa_comprehension.ml

Below is the relevant code:

module Make (Syntax : Sig.Camlp4Syntax) = struct
  open Sig;
  include Syntax;

  (* "[?" and "?]" are not recognized as delimiters by the Camlp4
     lexer. This token parser spots the token pairs "["; "?" and
     "?"; "]" and inserts "[?" and "?]" instead.
     Thanks to Jérémie Dimino for the idea. *)
  value rec delim_filter older_filter stream =
    let rec filter = parser
      [ [: `(KEYWORD "[", loc); rest :] ->
          match rest with parser
          [ [: `(KEYWORD "?", _) :] -> [: `(KEYWORD "[?", loc); filter rest :]
          | [: :] -> [: `(KEYWORD "[", loc); filter rest :] ]
      | [: `(KEYWORD "?", loc); rest :] ->
          match rest with parser
          [ [: `(KEYWORD "]", loc) :] -> [: `(KEYWORD "?]", loc); filter rest :]
          | [: :] -> [: `(KEYWORD "?", loc); filter rest :] ]
      | [: `other; rest :] -> [: `other; filter rest :] ] in
    older_filter (filter stream);

  value _ = Token.Filter.define_filter (Gram.get_filter ()) delim_filter;

  (* REST OF THE CAMLP4 EXTENSION ... *)
end;
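
The filter idea above can be modelled in plain OCaml without Camlp4. The following is only a sketch under stated assumptions: the token type, drop_blanks, current_filter and this define_filter are made-up stand-ins, not Camlp4's API, and it assumes that whitespace tokens are still in the stream when the newly defined filter runs (which is what makes "[ ?" stay as two tokens).

```ocaml
(* Toy token type; Camlp4's real token type is richer and carries
   locations. All names in this block are hypothetical. *)
type token =
  | Keyword of string
  | Blanks of string
  | Other of string

(* Merge the adjacent pairs "[" "?" and "?" "]" into single keywords,
   mirroring the structure of delim_filter in the message above. *)
let rec merge_delims = function
  | Keyword "[" :: Keyword "?" :: rest -> Keyword "[?" :: merge_delims rest
  | Keyword "?" :: Keyword "]" :: rest -> Keyword "?]" :: merge_delims rest
  | tok :: rest -> tok :: merge_delims rest
  | [] -> []

(* Stand-in for the filter that is already installed: it drops
   whitespace tokens before the parser sees the stream. *)
let drop_blanks toks =
  List.filter (function Blanks _ -> false | _ -> true) toks

(* The "current filter" is mutable state. Defining a new filter hands
   the old one to the new one, which decides how to compose them. *)
let current_filter = ref drop_blanks
let define_filter f = current_filter := f !current_filter

(* Run the merge first, then the older filter, as in the message. *)
let () = define_filter (fun older toks -> older (merge_delims toks))

let adjacent = [Keyword "["; Keyword "?"; Other "x"; Keyword "?"; Keyword "]"]
let spaced = [Keyword "["; Blanks " "; Keyword "?"]

let merged = !current_filter adjacent
let not_merged = !current_filter spaced
```

In this model, merged contains the single keywords "[?" and "?]", while not_merged still holds "[" and "?" as separate keywords because a blank token sat between them when the merge ran.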

* Re: [Caml-list] Is it possible to extend OCaml lexer rules via Camlp4?
From: Jun Furuse @ 2011-11-03 7:12 UTC
To: Gabriel Scherer; +Cc: caml-list

Gabriel,

Thanks for the info. But what I want cannot be achieved by a lexer
filter.

I want to have PCRE regexp literals in the same syntax as Perl, i.e.
/hello\sworld\\n/. Currently what we do in OCaml is Pcre.regexp
"hello\\sworld\\\\n", where each backslash must be escaped inside an
OCaml string literal. This is lousy for scripting in OCaml. To have the
same or a similar syntax as Perl, the lexer itself must be modified.

Currently I am using a modified Camlp4 in which I can replace the lexer
function, but that is an ad hoc approach, and I am looking for a
healthier way that does not require such a modification.

Jun
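
To make the escaping problem concrete, here is a small plain-OCaml check with no Camlp4 involved. It assumes a compiler that supports quoted string literals {|...|}, which were added in OCaml 4.02, well after this thread; the names perl_style and escaped are made up for illustration.

```ocaml
(* The pattern as it would appear between the slashes in Perl, written
   with a quoted string literal so that no backslash needs escaping.
   Assumption: OCaml >= 4.02 (quoted strings postdate this thread). *)
let perl_style = {|hello\sworld\\n|}

(* The same pattern as an ordinary OCaml string literal: every
   backslash has to be doubled. *)
let escaped = "hello\\sworld\\\\n"

let () =
  (* Both literals denote the same 15-character string. *)
  assert (perl_style = escaped);
  assert (String.length escaped = 15);
  print_endline escaped
```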

* Re: [Caml-list] Is it possible to extend OCaml lexer rules via Camlp4?
From: Jérémie Dimino @ 2011-11-03 9:16 UTC
To: Jun Furuse; +Cc: caml-list

Hi,

On Thu, Nov 03, 2011 at 04:12:29PM +0900, Jun Furuse wrote:
> I want to have pcre regexp literals in the same syntax as Perl i.e.
> /hello\sworld\\n/. Currently what we do in OCaml is Pcre.regexp
> "hello\\sworld\\\\n", where the backslash char must be escaped in a
> OCaml string literal. This is lousy for scripting in OCaml.

Have you looked at Camlp4 quotations? Basically, you can define a new
quotation named "foo", and in your code you can write:

  <:foo<...>>

The ... can be any string, except that it cannot contain >>.

You may also be interested in the Mikmatch syntax extension:

  http://martin.jambon.free.fr/micmatch.html

Cheers,

--
Jérémie
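
As a rough illustration of why quotations fit the regexp use case, here is a toy model in plain OCaml of how the raw text of <:name<...>> could be handed to an expander registered under that name. Everything in it is hypothetical (split_quotation, add_expander, expand, and the quotation name "m"); it is not Camlp4's Quotation API, and real Camlp4 quotation lexing is more involved than this first-">>" scan.

```ocaml
(* Hypothetical registry: quotation name -> expander over the raw text. *)
let expanders : (string, string -> string) Hashtbl.t = Hashtbl.create 8

let add_expander name f = Hashtbl.replace expanders name f

(* Split "<:name<body>>" into (name, body). The body is taken verbatim
   up to the first ">>", so it cannot itself contain ">>". *)
let split_quotation (s : string) : (string * string) option =
  let len = String.length s in
  if len < 6 || String.sub s 0 2 <> "<:" then None
  else
    match String.index_from_opt s 2 '<' with
    | None -> None
    | Some i ->
      let name = String.sub s 2 (i - 2) in
      (* Position of the first ">>" at or after index j, if any. *)
      let rec find j =
        if j + 1 >= len then None
        else if s.[j] = '>' && s.[j + 1] = '>' then Some j
        else find (j + 1)
      in
      (match find (i + 1) with
       | Some j when j + 2 = len ->
         Some (name, String.sub s (i + 1) (j - i - 1))
       | _ -> None)

(* Look up the expander for the quotation's name and apply it. *)
let expand (s : string) : string option =
  match split_quotation s with
  | None -> None
  | Some (name, body) ->
    (match Hashtbl.find_opt expanders name with
     | None -> None
     | Some f -> Some (f body))

(* Toy expander: emit source text in which the raw pattern has been
   escaped as an ordinary OCaml string literal (%S does the escaping). *)
let () = add_expander "m" (fun raw -> Printf.sprintf "Pcre.regexp %S" raw)

let result = expand {|<:m<hello\sworld\\n>>|}
```

The point of the sketch is that the text between the delimiters is not lexed as an OCaml string, so the author writes the pattern once, unescaped, and the expander takes care of producing the escaped form.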

* Re: [Caml-list] Is it possible to extend OCaml lexer rules via Camlp4?
From: Nicolas Pouillard @ 2011-11-05 21:19 UTC
To: Jun Furuse, caml-list

On Thu, Nov 3, 2011 at 10:16 AM, Jérémie Dimino <jeremie@dimino.org> wrote:
> Hi,
>
> On Thu, Nov 03, 2011 at 04:12:29PM +0900, Jun Furuse wrote:
>> I want to have pcre regexp literals in the same syntax as Perl i.e.
>> /hello\sworld\\n/. Currently what we do in OCaml is Pcre.regexp
>> "hello\\sworld\\\\n", where the backslash char must be escaped in a
>> OCaml string literal. This is lousy for scripting in OCaml.

As said earlier, Camlp4's lexer is not extensible. One can change the
meaning of the token stream using token filters, but this won't work in
your case. The third option is to use quotations; this is really the
feature best suited to this task. Of course, the syntax won't be as
concise as /bla/...

Regarding OCaml lexing, you may be interested in camllexer [1], which is
not intended to be extensible but is very small and self-contained. If
you really want to hack your own lexical syntax, I suggest you fork
camllexer and change it for your purpose.

[1]: https://github.com/np/camllexer

Best regards,

--
Nicolas Pouillard
http://nicolaspouillard.fr

* Re: [Caml-list] Is it possible to extend OCaml lexer rules via Camlp4?
From: Jun Furuse @ 2011-11-06 0:58 UTC
To: Nicolas Pouillard; +Cc: caml-list

Hi,

Unfortunately, the conclusion seems to be that there is currently no way
to change the lexer from pa_*.cmo modules. So I will stick to my patched
P4 approach for now. With it I can use the $/regexp\n/i and
$`find . -iname hoo` syntax, while those using the vanilla P4 can still
use <:m<regexp\n/i>> and <:qx<find . -iname hoo>>:

https://bitbucket.org/camlspotter/orakuda/src/50d736f39428/test

Thanks,
Jun