From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail1-relais-roc.national.inria.fr (mail1-relais-roc.national.inria.fr [192.134.164.82]) by walapai.inria.fr (8.13.6/8.13.6) with ESMTP id pBANZ4Ip030998 for ; Sun, 11 Dec 2011 00:35:04 +0100 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AkUBAAvs405KfVI0imdsb2JhbABDFpo6kCcIIgEBAQoJDQcSBiGBcgEBAQECAQwGAiwBGxIEBwEDAQsGBQsNDQgZIgERAQUBChIGEwgKAgcHh2YImCIKi2SCa4QaPYhxAgUMg2yET4MmBJRxjXE9g3o X-IronPort-AV: E=Sophos;i="4.71,332,1320620400"; d="scan'208";a="134863091" Received: from mail-ww0-f52.google.com ([74.125.82.52]) by mail1-smtp-roc.national.inria.fr with ESMTP/TLS/RC4-MD5; 11 Dec 2011 00:34:58 +0100 Received: by wgbdr12 with SMTP id dr12so10130390wgb.9 for ; Sat, 10 Dec 2011 15:34:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type:content-transfer-encoding; bh=jIjsYlcweAOacrsu8NjwWw9X3P+Jf22OFRRNKtguajo=; b=DE96bEZ4T33nBL1RtOBFGBu1JN+QW1E3NESb3BiJdps6OnKW/CyAlliqX7zcUxyA3f yeI4QLDW5wNj5ChBZ+cJCTAG8SazIYnJp57nRJgEyQ3tsM6bD6gegZnc6F+ZgtHdyOmK hxrpGAkdjoN8HWWbGL9pZP9o1geE6kut4125Q= Received: by 10.227.198.142 with SMTP id eo14mr10326042wbb.28.1323560098287; Sat, 10 Dec 2011 15:34:58 -0800 (PST) MIME-Version: 1.0 Received: by 10.227.43.4 with HTTP; Sat, 10 Dec 2011 15:34:37 -0800 (PST) In-Reply-To: References: <55531934-37A5-4CC5-AB67-20CE4CCE8269@googlemail.com> <4EE37070.4010702@inria.fr> <1323541702.32136.8.camel@aurora> <1323550544.32136.19.camel@aurora> From: Gabriel Scherer Date: Sun, 11 Dec 2011 00:34:37 +0100 Message-ID: To: Wojciech Meyer Cc: =?ISO-8859-1?Q?J=E9r=E9mie_Dimino?= , caml-list@inria.fr Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by walapai.inria.fr id pBANZ4Ip030998 Subject: Re: [Caml-list] Camlp4/p5 type reflection [was: OCaml maintenance status / community fork (again)] A summary to this lengthy mail: (1) Why type-enriched Camlp4 is an unreasonable idea (2) We should extract the typedtree; why it's hard (3) A fictional narrative of the camlp4/camlp5 history (4) Why you don't want to become Camlp4 maintainer (5) How we could try not to use Camlp4 in the future (6) Syntax extension survival advices # (1) Why type-enriched Camlp4 is an unreasonable idea Wojciech, your idea of having type information at the Camlp4 level is absolutely unreasonable. You are not speaking about a "minor change" here, but a major rewrite that would affect the compiler internals as well. It would really be a new (and interesting) project. Camlp4 is, and I guess will remain, a syntax-level preprocessing tool. You have to accept the fact that you can't use type information at this level (but you can certainly "interact" in some way with the type system by producing/transforming pieces of code in a way that you know will have interesting typing effects; for example, you may want to generate code that is purposedly ill-typed in some cases, to disallow certain uses of your syntax extension). I'm not even sure what it would mean to access type information at the camlp4 level, as you're producing and transforming untyped AST; would you want partially typed ASTs? How is the typer supposed to work on the part that you haven't transformed yet, and therefore are not valid OCaml syntax? I suppose you could have a "preprocessing and transformation" tool at the typedtree level, but that would be a different tool with different uses, distinct from the syntactic preprocessing part (though you may develop "extensions" that act on both fronts). I'm not aware of so much Camlp4 situations that would really require typing information. I would be interested in good examples if you have some. One problem that I have had with Camlp4 is that you don't have identifier resolution information (eg. you don't know if the identifier "(@)" you're seeing is really list concatenation, or has been redefined/shadowed in the context); this makes uses of Camlp4 for inlining, for example, quirky and fragile. That's still a simpler problem than type information. # (2) We should extract the typedtree; why it's hard If you really want to play with type information and scope-resolved identifiers, AST-manipulating tools is probably not the way to go: you indeed want full access to the typedtree. Currently this is only possible by hacking the compiler, and this is what for example Jun Furuse's Ocamlspotter project does. Those kind of tools could be made less intrusive if it was possible to pass typedtree-like information in and out of the compiler. I remember reading that some people (OcamlPro, I suppose) have this on their target list. The problem however is that the current internal compiler's typedtree representation is not at all adapted for external communication. If you want a kind of tool that is robust and future-proof in any sense (you could probably get something working by just marshalling the current typedtree, but then it could break awfully after minor language changes, make the compiler choke, etc.; I certainly wouldn't want to use that), you have to design a clean and efficient representation for OCaml programs after the type inference phase. Having a solid proposal on this topic wil be an awful lot of work. (3) A fictional narrative of the camlp4/camlp5 history Jérémie Dimino wrote: > But there is something I don't understand here. Why is there camlp4 and > camlp5 ? These two projects do exactly the same thing and are > incompatible. So i don't see the point of maintaining them both. We > should at least deprecate one. Let me repeat the story as I know it (with possible mistakes, I was still a caml baby in the <3.10 times) in a hopefully compact form for those on the list who have no idea about it. DISCLAIMER: this is only a fictional storytelling, meant to give a reasonable idea (or at least my vision) of the situation. I may be wrong about the events chronology, people name, hard facts, and of course english spelling and grammar. The story is complicated and I don't know the gory details. If you know a better story, feel free to add important precisions, correct the obvious mistakes, etc. I also welcome suggestions to make it a funny, entertaining read; finally, a few romantic details could clearly turn it into a blockbuster. The original Camlp4 tool was mostly developped by Daniel de Rauglaudre. Apparently, personal relations between Daniel and the OCaml team were not easy, and Camlp4 was gradually becoming more and more external to the OCaml distribution (in the past, the stream syntax was available as part of the core language, but it was moved to Camlp4; the Oreilly book was written before that move) and its maintainance status incertain. In the 3.09/3.10 transition, Nicolas Pouillard started working on a refactoring of the Camlp4 codebase (which was mostly a silent, non-moving animal at that time) to make it more easy to evolve and maintain. The refactoring maybe went "a bit too far", in that it brought a number of changes to the external interface and, in particular, broke existing Camlp4 extension, as well as the (quite good) Camlp4 documentation. Of course they also were advantages, in that the new design was modular and, for example, the bootstrapping process was made easier, and Nicolas Pouillard could maintain the tool as offered by the distribution. The 3.10 transition was however very painful for people using existing Camlp4 extensions (I'm thinking of eg. Martin Jambon, which had extensive Camlp4 extensions, and the Coq team which has user-defined notations using Camlp4 and, huh, I really don't want to know the details); basically they didn't upgrade to 3.10 -- instead of porting the extensions, as was originally hoped. Personal note: I learned Camlp4 in this period, just after the release of OCaml 3.10. I'm honestly unaware of how pre-3.10 Camlp4 was (though I guess it is not too difficult to move from one to the other) so my interpretations of previous times are all based on reading the mailing-list. Daniel, which apparently did not agree with some of the changes made, relatively suddenly restarted developpment of "his" branch of Camlp4, taken from the old sources, before refactoring. This was done as a separate project, outside the OCaml distribution (apparently Daniel and the OCaml team prefer not to work together). In a quite creative move, he named his version "camlp5" so that it could be easily distinguished from the "upstream camlp4", the incompatible new version being distributed with OCaml. Camlp5 is therefore the continuation of the *old* camlp4. Development, however, continued (while still preserving or mostly preserving compatibility with pre-3.10 extensions, ensuring grateful thanks from the users) with non-neglectible changes (eg. addition of a library of non-destructive streams), and is currently ongoing (as is maintainance of pre-3.10 camlp4 in the OCaml distribution). http://cristal.inria.fr/~ddr/camlp5/CHANGES Of the projects that relied on Camlp4 before 3.10, some of them ported they extension to >=3.10 Camlp4, and lived happily ever after, and some of them jumped on camlp5 as a lower-cost migration opportunity and, I suppose, also lived happily ever after; or at least, as happily as you can knowing that your codebase depends on camlp{4,5}. I make no technical judgments of which version (Camlp4 or Camlp5) is "better". I know and (try not to) use Camlp4. I also understand the position of Camlp5 users for compatibility reasons. I would advise newcomers to try Camlp4 first as it has a larger user base (being distributed with the OCaml distribution). Anyway, see at the end of this mail why you may not want to use camlp* anyway. # (4) Why you don't want to become Camlp4 maintainer > I am volunteer for the maintenance of Camlp4. Jérémie, I am deeply impressed by your sense of sacrifice. Camlp4 is a piece of devil beauty. It does incredibly clever things, and is incredibly complex inside: Daniel is clearly a remarkable hacker, but his code is not easy to understand. I know that maintaining the whole thing is very hard; and that is the reason why Camlp4 tends to have problems to bump from one version to another, when non-neglectible syntaxic changes are made to the language. The core internals of Camlp4 are quite imperative: parsing is done by erasing tokens from the input stream (which makes for lovely Stream debugging), but, more importantly, grammar extensions are destructive mutations of the grammar. It is sometimes painful when using Camlp4, and I suppose it must be hell maintaining it. # (5) How we could try not to use Camlp4 in the future I think Alain's idea of moving out of Camlp4 is actually a good thing. After having used Camlp4 extensively, I came out with the general feeling that allowing general extension of the grammar is not a good thing, because it is too complex and the solutions are too fragile. On the contrary, the quotation mechanism of Camlp4 is a wonderful tool, that acts as a *controlled* extension point for the grammar, and is therefore much more robust. The other excellent use of Camlp4 is generating OCaml code out of "annotations" on certain AST nodes, as the 'type-conv' extension does; this is also a form of controlled extensibility that could, I think, be taken into account in the base language or a simple tool, and doesn't require full-fledged grammar mutation. I think we should isolate such restricted uses of syntax extensibility and allow them through simpler, more robust tools. Alain's idea, and the reactions from Camlp4 users (Martin Jambon, Nicolas Pouillard), can be found on his blog: http://www.lexifi.com/blog/syntax-extensions-without-camlp4 I'm personally not sure his solution is quite ready to replace Camlp4 yet; or at least the state in which it was at this time. He has an annotation mechanism, but it is somehow not restricted enough to guarantee a reasonable behaviour; as long as people are tempted to use "annotations" to define try..finally, we will have fragile extensions floating around. (And I'm not sure it's a good idea to move Camlp4 out of the distribution as long as we don't have a viable alternative proposal and some users have started moving to it. Camlp4 will still need to be supported by someone anyway, and it needs to evolve in lockstep with OCaml language changes.) Yet I do agree that "without Camlp4" is the future. It may take some time, but what I would personally like to see in a few years is an OCaml world where we don't need Camlp4 anymore, because we have other, simpler tools to do the reasonable things, and have learned to live without the rest. The whole "maintain Camlp4" entreprise will still be useful, necessary until that time, but it won't stay, I hope, at the center of attentions. # (6) Syntax extension survival advices To the reader considering use of a new syntax extension in his next project: - don't - if you really must, try to make sure that your code is also reasonable to use *without* a syntax extension (eg. by producing a library with a clean interface, making your extension desugar to uses of it, but also making sure that it can be used by the human user) - if you really must, try to get it in the form of a quotation; the rest is fragile - alternatively, try to branch yourself on existing "flexible" syntax extensions such as Markus Mottl's 'type-conv' and Jeremy Yallop's 'patterns': you are relatively safe if you don't write any line of code modifying OCaml's syntax yourself. http://hg.ocaml.info/release/type-conv http://code.google.com/p/ocaml-patterns/ - do not hesitate to send for code review and/or ask for help on the list - don't On Sat, Dec 10, 2011 at 10:40 PM, Wojciech Meyer wrote: > Jérémie Dimino writes: > >> Le samedi 10 décembre 2011 à 19:10 +0000, Wojciech Meyer a écrit : >>> I'm aware that these are huge changes to Camlp4, but it would make >>> meta programming more powerful and push Camlp4 to the next level. >> >> Sure. But it seems that the next version of OCaml will have runtime >> types, see http://www.lexifi.com/blog/runtime-types , so maybe it is not >> needed to add this to camlp4. > > It's interesting and I didn't know about it. However, the problem is > slightly different, I would like to know the typing of a freshly > generated piece of code by Camlp4 in the previous phase. Then, have > pattern matching against these meta types in annotated AST and produce > another AST, which in turn have most likely another typing and pass to > the next phase etc. I would say that Camlp4 is fine for the simpler > syntax extensions and majority of small DSLs but when you start composing > syntax extensions and macros it quickly becomes a problem. > >> >> Also they are problems that i don't know how to solve with the camlp4 >> approach. For example consider: >> >>   let x = 1 >>   type int = A >>   let y = A >> >> The typer knows that x has the type (int, 1) and y has the type (int, >> 42). But what you send to ocaml is a parse tree, and you cannot make >> this difference in the parse tree. > > Yes, you would need a type information in the parse tree as mentioned > before, so you want to feed up the compiler with AST start unrolling > macros top-down and then follow up with the inferred types bottom-up. > > Cheers; > Wojciech > > > -- > Caml-list mailing list.  Subscription management and archives: > https://sympa-roc.inria.fr/wws/info/caml-list > Beginner's list: http://groups.yahoo.com/group/ocaml_beginners > Bug reports: http://caml.inria.fr/bin/caml-bugs >