From: John Prevost <prevost@maya.com>
Cc: caml-list@inria.fr
Subject: Camlp4's (lack of) hygiene (was Re: Macros)
Date: 06 Jul 2000 22:03:43 -0400 [thread overview]
Message-ID: <87d7kqitmo.fsf_-_@isil.localdomain> (raw)
In-Reply-To: Matthias Blume's message of "05 Jul 2000 21:24:22 +0900"
The following message is a courtesy copy of an article
that has been posted to comp.lang.functional as well.
I'm also forwarding this message to the caml-list for reference.
{ Summary: No, it's not hygienic, which I wasn't aware of. Have to
bug people to fix that. Detailed examples, and explanation of what
ways camlp4 is semi-hygienic and very powerful.
Summary of non-hygiene: It's hard not to capture variables in
subexpressions, if you declare temporaries. This should be fixed,
and it's probably pretty doable. Just make any variable reference
in a quotation be somehow gensymmed or rename variables in
subexpressions. }
>>>>> "mb" == Matthias Blume <see@my.sig> writes:
>>>>> "mm" == Markus Mottl <mottl@miss.wu-wien.ac.at> writes:
mm> I haven't needed camlp4 so far, but it is a pretty powerful
mm> tool: calling it a "preprocessor" is actually an
mm> underestimation of its capabilities.
mb> It "processes", and it does this before the compiler sees the
mb> program, hence "pre". That's a "pre-processor" to me.
mb> Still, is its handling of syntax-trees hygienic with respect
mb> to the Caml language?
mm> You can transform abstract syntax trees of the language with
mm> it and pretty print the result again.
mb> M4, being Turing-complete (AFAIK), can do the same. (Of
mb> course, I am not saying that the result would be beautiful in
mb> any sense of the word. :)
camlp4's pre-processing for ocaml works by using a parser with support
for extensible grammars. Here's an example from the manual:
--camlp4-manual---------------------------------------------------------
3.5 Examples
3.5.1 Arithmetic calculator
This is an example of a grammar of arithmetic expressions:
let gram = Grammar.create (Plexer.make ());;
let test = Grammar.Entry.create gram "expression";;
let expr = Grammar.Entry.create gram "expression";;
EXTEND
test: [ [ e = expr; EOI -> e ] ];
expr: [ "plus" LEFTA
[ e1 = expr; "+"; e2 = expr -> e1 + e2
| e1 = expr; "-"; e2 = expr -> e1 - e2 ]
| "mult" LEFTA
[ e1 = expr; "*"; e2 = expr -> e1 * e2
| e1 = expr; "/"; e2 = expr -> e1 / e2 ]
| [ e = INT -> int_of_string e
| "("; e = expr; ")" -> e ] ];
END;;
let calc str =
try Grammar.Entry.parse test (Stream.of_string str) with
Stdpp.Exc_located (loc, e) ->
Printf.printf "Located at (%d, %d)\n" (fst loc) (snd loc);
raise e
;;
Now, an extension of the entry ``expr'' to add the modulo, could be:
EXTEND expr:
AFTER "mult" [ [ e1 = expr; "mod"; e2 = expr -> e1 mod e2 ] ];
END;;
------------------------------------------------------------------------
the above is somewhat lex/yacc like. It's possible to replace the
lexer as well.
This works with ocaml by producing results which are ASTs. There are
camlp4 extensions used to write ASTs in a natural
real-caml-syntax-like way, but the underlying structure is a set of
strongly typed datastructures. For ocaml front-end use, a byte
representation of the AST (not a pretty-printed source file) is then
output for the compiler to digest.
Because of the strong typing, it is at the very least not possible to
write an extension that outputs grammatically incorrect code.
This is at least a level of hygiene that isn't seen in most
preprocessors. Again, extensions to the real parser, outputting a
type-safe AST.
Now, all of this is more than a little heavy. Here's an example
adding repeat <expr> until <expr>, which does what you'd expect.
--camlp4-Manual---------------------------------------------------------
4.3.2 Repeat until à la Pascal
The ``repeat...until'' loop of Pascal is closed to the ``while'' loop
except that it is executed at least once. We can implement it like
this:
open Pcaml;;
EXTEND
expr: LEVEL "let"
[[ "repeat"; e1 = expr; "until"; e2 = expr ->
<:expr< do $e1$; return while not $e2$ do $e1$; done >> ]];
END;;
------------------------------------------------------------------------
This says, essentially: "extend the current parser, add an entry at
the level named 'let' (same level as let expressions). Match the
keyword "repeat", follwed by an expression, the keyword "until", and
another expression, and use the resulting AST.
The <:expr< ... >> is a "quotation", which is an extension added in
camlp4 to allow selectively escaping out into other languages. The
$blah$ segments inside refer to variables in the ocaml code, rather
than the metalanguage code. So let's compare the "when" hygienic
macro example from R5RS to the equivalent camlp4 extension:
You might wonder, at this point, about introducing new variables
inside the output without capturing--which, I now recall, is the
really key thing about hygienic macros.
Here's the example which shows (the lack of) that property:
--camlp4-manual---------------------------------------------------------
4.3.1 Infix
This is an example to add the infix operator ``o'', composition of two
functions. For the meaning of the quotation expr used here, see
appendix A.
open Pcaml;;
EXTEND
expr: AFTER "apply"
[[ f = expr; "o"; g = expr -> <:expr< fun x -> $f$ ($g$ x) >> ]];
END;;
------------------------------------------------------------------------
You see here the $f$ and $g$ antiquoting out to get the values of f
and g. We also see "x", which is not antiquoted. Will this capture a
reference to x in f or g? Let's see:
isil $ ocamlc -pp 'camlp4o ./infix.cmo' testi.ml /home/prevost/src/caml/test
File "testi.ml", line 4, characters 23-28:
This expression has type 'a -> 'b but is here used with type 'a
isil $ camlp4o ./infix.cmo pr_o.cmo testi.ml /home/prevost/src/caml/test
let a y = y + 1;;
let x y = y + 2;;
Printf.printf "%d\n" ((fun x -> a (x x)) 0);;
Survey says: yes, it does interfere. Very disappointing--and I'll
have to ask about it on the caml list.
But, I will show a quick example of the same thing in hygienic macros
and in camlp4:
--R5RS------------------------------------------------------------------
(let-syntax ((when (syntax-rules ()
((when test stmt1 stmt2 ...)
(if test
(begin stmt1
stmt2 ...))))))
------------------------------------------------------------------------
Here's the camlp4 version:
--camlp4-example--------------------------------------------------------
open Pcaml
EXTEND
expr: LEVEL "top"
[[ "when"; e1 = expr; "do"; e2 = expr ->
<:expr< if $e1$ then $e2$ else () >> ]];
END
------------------------------------------------------------------------
(It turns out the documentation is a little out of date, and there's
no level named "let" any more.)
this then takes the program:
--camlp4-example-program------------------------------------------------
when true do print_string "test\n";;
------------------------------------------------------------------------
quite happily. Of course, in O'Caml you can have ifs with no elses,
which makes it a but more pointless.
Now--why all this power? Why not something simpler, like Scheme's
hygienic macros? The reason is that you can do more powerful
manipulations. As an example, I've seen camlp4 quotations for regular
expressions:
<:re<a*b*>>
or things which hoist constant expressions up to the top level so
they're only executed once. This kind of power is good.
So: hygienic? No. But that can be fixed. Powerful? Yes. In
actuality, I think you can actually avoid most collisions by using let
and:
open Pcaml;;
EXTEND
expr: AFTER "apply"
[[ f = expr; "o"; g = expr -> <:expr< let f = $f$ and g = $g$ in
fun x -> f (g x) >> ]];
END;;
except, of course, that this isn't the first obvious thing to do, and
this is not something that will work in all cases (i.e. the "my-or"
hygienic macro). Not only that, but there's no "gensym" function
here (at least none I know of.)
Oh--actually, this will in fact work in every case, if you make the
following transformation:
... <:expr< let f () = $f$ and g () = $g$ in fun x -> f () (g () x) >>
silly example, but it shows how you could use thunks to always avoid
capturing. Again, not immediately obvious that you need to.
So in any case--I'll bring this up with the developers. I wasn't
aware the problem was there.
John.
next parent reply other threads:[~2000-07-07 14:47 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <8js72h$11h$1@nnrp1.deja.com>
[not found] ` <Pine.LNX.4.21.0007041051140.20796-100000@punaluu.informatik.uni-freiburg.de>
[not found] ` <8juuep$420$1@news.planetinternet.be>
[not found] ` <u0lmzhyllz.fsf@hana.kurims.kyoto-u.ac.jp>
[not found] ` <8jv92l$qpb$1@bird.wu-wien.ac.at>
[not found] ` <u0wvj0datl.fsf@hana.kurims.kyoto-u.ac.jp>
2000-07-07 2:03 ` John Prevost [this message]
2000-07-07 23:42 ` John Prevost
2000-07-10 9:37 ` Daniel de Rauglaudre
2000-07-10 10:17 ` John Prevost
2000-07-10 11:42 ` Judicael Courant
2000-07-10 13:16 ` John Prevost
2000-07-17 10:08 ` Markus Mottl
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87d7kqitmo.fsf_-_@isil.localdomain \
--to=prevost@maya.com \
--cc=caml-list@inria.fr \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox