Mailing list for all users of the OCaml language and system.
From: Jun Furuse <jun.furuse@gmail.com>
To: Gabriel Scherer <gabriel.scherer@gmail.com>
Cc: caml-list <caml-list@inria.fr>
Subject: Re: [Caml-list] Is it possible to extend OCaml lexer rules via Camlp4?
Date: Thu, 3 Nov 2011 16:12:29 +0900
Message-ID: <CAAoLEWu6wL5twx2Dwgs5DSwwj2yAYyvoaZ7dtDKsdJB6=CaxjQ@mail.gmail.com>
In-Reply-To: <CAPFanBE2QYN54Zg-MYN9k3U63Vucy9M3CMbvf_c2bh5qw7Q0oQ@mail.gmail.com>

Gabriel,

Thanks for the info, but what I want cannot be achieved with a lexer filter.

I want PCRE regexp literals in the same syntax as Perl, i.e.
/hello\sworld\\n/. Currently in OCaml we have to write Pcre.regexp
"hello\\sworld\\\\n", where every backslash must be escaped again
inside an OCaml string literal. This is lousy for scripting in OCaml.
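
As a small illustration of the escaping overhead described above (a sketch using only the standard library, so it does not call Pcre itself), the string literal below denotes exactly the regexp text that Perl would write between slashes:

```ocaml
(* The regexp text that PCRE should receive, written as an OCaml string
   literal: every backslash of the regexp is doubled in the source. *)
let pattern = "hello\\sworld\\\\n"

let () =
  (* Prints the regexp as PCRE would see it: hello\sworld\\n *)
  print_endline pattern;
  (* The value has 15 characters, against 18 between the quotes above. *)
  Printf.printf "%d\n" (String.length pattern)
```

The three extra characters are the doubled backslashes, and they grow with every escape in the regexp.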

To get the same or a similar syntax as Perl, the lexer itself must be
modified. Currently I am using a modified Camlp4 in which I can replace
the lexer function, but that is an ad hoc approach, and I am looking for
a healthier way that needs no such modification.
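
To make concrete what kind of rule the lexer would need, here is a rough sketch of scanning a /.../ literal by hand. It is not Camlp4 code: the function name scan_regex_literal and its return shape are hypothetical, and it works on a plain string rather than on the real lexer buffer.

```ocaml
(* Hypothetical helper: if s has a slash at index i, read up to the next
   unescaped slash and return the raw body together with the index just
   after the closing slash. Backslash pairs are copied verbatim, so no
   string-literal unescaping takes place. *)
let scan_regex_literal (s : string) (i : int) : (string * int) option =
  let len = String.length s in
  if i >= len || s.[i] <> '/' then None
  else begin
    let buf = Buffer.create 16 in
    let rec loop j =
      if j >= len then None (* unterminated literal *)
      else
        match s.[j] with
        | '/' -> Some (Buffer.contents buf, j + 1)
        | '\\' when j + 1 < len ->
            (* Keep the backslash and the character it escapes as they are. *)
            Buffer.add_char buf '\\';
            Buffer.add_char buf s.[j + 1];
            loop (j + 2)
        | c ->
            Buffer.add_char buf c;
            loop (j + 1)
    in
    loop (i + 1)
  end
```

The body it returns is the text a regexp engine expects, with backslashes untouched, which is exactly what a token filter running after the standard lexer can no longer recover.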

Jun

On Thu, Nov 3, 2011 at 7:52 AM, Gabriel Scherer
<gabriel.scherer@gmail.com> wrote:
>> I have tried to override the whole syntax as follows, but it seems
>> to change nothing:
>
> Camlp4 is designed around mutable state. That your Make module produces
> a new grammar doesn't make it the current grammar used by the
> preprocessor. What needs to happen is that the current state of the
> lexer/parser is *mutated* by your Make module (whose evaluation is
> then delayed and controlled by Camlp4 itself thanks to registration
> with Register). This is what EXTEND does at the grammar level.
>
> If it suits your needs, you can define your modification as a filter
> that will post-process the output of the original lexer. The Token
> module exposes a define_filter function to imperatively update the set
> of such active stream transformers.
>
> This idea was suggested to me by Jérémie Dimino, and works well. It is
> used for example in pa_comprehension to define "[?" and "?]" as new
> OCaml keywords (asking for "["; "?" at the camlp4 grammar level would
> allow spaces in between):
>  https://github.com/ocaml-batteries-team/batteries-included/blob/master/src/syntax/pa_comprehension/pa_comprehension.ml
>
> Below is the relevant code:
>
>  module Make (Syntax : Sig.Camlp4Syntax) = struct
>    open Sig;
>    include Syntax;
>
>
>    (* "[?" and "?]" are not recognized as delimiters by the Camlp4
>      lexer; This token parser will spot "["; "?" and "?"; "]" token
>      and insert "[?" and "?]" instead.
>     Thanks to Jérémie Dimino for the idea. *)
>    value rec delim_filter older_filter stream =
>      let rec filter = parser
>      [ [: `(KEYWORD "[", loc); rest :] ->
>          match rest with parser
>          [ [: `(KEYWORD "?", _) :] -> [: `(KEYWORD "[?", loc); filter rest :]
>          | [: :] -> [: `(KEYWORD "[", loc); filter rest :] ]
>      | [: `(KEYWORD "?", loc); rest :] ->
>          match rest with parser
>          [ [: `(KEYWORD "]", loc) :] -> [: `(KEYWORD "?]", loc); filter rest :]
>          | [: :] -> [: `(KEYWORD "?", loc); filter rest :] ]
>      | [: `other; rest :] -> [: `other; filter rest :] ] in
>      older_filter (filter stream);
>
>    value _ = Token.Filter.define_filter (Gram.get_filter ()) delim_filter;
>
>    (* REST OF THE CAMLP4 EXTENSION ... *)
>  end;
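
The filter chaining above can be modelled in plain OCaml without Camlp4. This is a simplified sketch under stated assumptions: the token type, the current_filter slot and this define_filter are hypothetical stand-ins, tokens are kept in a list instead of a stream, and locations are dropped.

```ocaml
(* Simplified stand-in for Camlp4 tokens; locations are omitted. *)
type token = Keyword of string | Ident of string

(* Fuse an opening bracket followed by a question mark, and a question
   mark followed by a closing bracket, into single keywords. *)
let rec fuse_delims = function
  | Keyword "[" :: Keyword "?" :: rest -> Keyword "[?" :: fuse_delims rest
  | Keyword "?" :: Keyword "]" :: rest -> Keyword "?]" :: fuse_delims rest
  | tok :: rest -> tok :: fuse_delims rest
  | [] -> []

(* Hypothetical mutable filter slot, initially the identity filter. *)
let current_filter : (token list -> token list) ref = ref (fun toks -> toks)

(* Install a new filter that is built from the previously installed one. *)
let define_filter make_filter = current_filter := make_filter !current_filter

(* Same shape as delim_filter: run the new pass, then the older filter. *)
let delim_filter older_filter toks = older_filter (fuse_delims toks)

let () = define_filter delim_filter

let filtered =
  !current_filter
    [ Keyword "["; Keyword "?"; Ident "x"; Keyword "?"; Keyword "]" ]
(* filtered = [Keyword "[?"; Ident "x"; Keyword "?]"] *)
```

The point of the model is that define_filter mutates the shared slot, so whatever reads the slot afterwards sees the new behaviour; merely building a new value, as the Make module in the original question does, would leave the slot untouched.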
>
>
> On Wed, Nov 2, 2011 at 9:34 PM, Jun Furuse <jun.furuse@gmail.com> wrote:
>> Hi,
>>
>> Is it possible for Camlp4 to implement an OCaml syntax extension (i.e.
>> pa_*) which modifies the lexer of OCaml syntax?
>>
>> I have tried to override the whole syntax as follows, but it seems
>> to change nothing:
>>
>> -----------------------------------------------------------
>> open Camlp4
>>
>> module Id : Sig.Id = struct
>>  let name = "pa_extlex"
>>  let version = "1.0"
>> end
>>
>> (* XLexer reimplements the OCaml lexer with some extra rules *)
>> module XLexer = Xlexer.Make(PreCast.Token)
>> module XGram = PreCast.MakeGram(XLexer)
>>
>> module Make (Syntax : Sig.Camlp4Syntax) = struct
>>  let _ = prerr_endline "Creating OCaml syntax with lexer extension"
>>  module M1 = OCamlInitSyntax.Make(PreCast.Ast)(XGram)(PreCast.Quotation)
>>  module M2 = Camlp4OCamlRevisedParser.Make(M1)
>>  module M3 = Camlp4OCamlParser.Make(M2)
>>  include M3
>> end
>>
>> let module M = Register.OCamlSyntaxExtension(Id)(Make) in ()
>> -----------------------------------
>>
>> Jun



Thread overview: 6+ messages
2011-11-02 20:34 Jun Furuse
2011-11-02 22:52 ` Gabriel Scherer
2011-11-03  7:12   ` Jun Furuse [this message]
2011-11-03  9:16     ` Jérémie Dimino
2011-11-05 21:19       ` Nicolas Pouillard
2011-11-06  0:58         ` Jun Furuse
