From: Jeff Henrikson
To: caml-list@inria.fr
Date: Wed, 01 Nov 2006 20:35:01 -0800
Subject: augmented camlp4 ASTs
Message-ID: <45497575.3030708@yahoo.com>

Hello,

Trying to measure the feasibility/cost of doing a couple of things with camlp4.
I like the camlp4 capability of adding constructions to the grammar. However, I sometimes want to do nonlocal transformations on the contents, and the fact that I must always immediately return an expression on the right-hand side bothers me. I can call the parsing part of camlp4 to get an MLast.expr, but the MLast grammar is fixed and cannot have productions added to it.

One way to get an extensible grammar would be to project one-to-one into (not onto) a looser AST type, something like:

    type delimiter = char * char
    type separator = char
    type sexp =
        Atom of string
      | Series of delimiter * separator * sexp array
      | SpecialForm of string * sexp array * sexp array;;

(I haven't handled everything here, like infix operators.) Then all the quotations of MLast.expr could be rewritten to target this grammar, and the parser could be copied to write into these expressions. When future EXTEND statements add things, the corresponding entries in the AST could be added.

The downside is obviously that exhaustiveness checking is defeated. That is always the price one pays for dynamism.

Something new and different that would become possible if an alternate AST like this were substituted is a "universal" indentation mode, which would indent traditional syntax, revised syntax, or any extension thereof. Preferences would be reduced to a lookup table that maps special forms to their indentation:

    "if"  => +2
    "let" => +2
    "in"  => -2   (there would probably have to be more context here)

As for my use case of nonlocal macro transformations, the low road is of course to project my special forms into "reserved" identifiers which look like values but are actually still special forms. I then run my transformations with special cases that catch the transformed special forms. This is not horrible for the use case I have in mind now.

Thanks for any help.

Regards,
Jeff Henrikson
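
A minimal sketch of the indentation lookup table idea above, in plain OCaml with no camlp4 dependency. The sexp type is the one from the message; everything else (indent_table, indent_delta, render, and the reading of the two SpecialForm arrays as "arguments" and "body") is a hypothetical illustration, not part of camlp4. The "in" entry is listed but not exercised, since it would need more context than a flat table provides.

```ocaml
(* The looser AST type from the message. *)
type delimiter = char * char
type separator = char
type sexp =
  | Atom of string
  | Series of delimiter * separator * sexp array
  | SpecialForm of string * sexp array * sexp array

(* Hypothetical preference table: special form keyword -> indentation delta. *)
let indent_table : (string * int) list =
  [ ("if", 2); ("let", 2); ("in", -2) ]

(* Unknown special forms get no extra indentation (an assumption). *)
let indent_delta (kw : string) : int =
  try List.assoc kw indent_table with Not_found -> 0

(* Render one node per line. Children of a special form are shifted by
   the table entry for its keyword; Series children are shifted by one
   column. Treating both SpecialForm arrays alike is an assumption. *)
let rec render (col : int) (e : sexp) : string list =
  let pad = String.make (max 0 col) ' ' in
  match e with
  | Atom s -> [ pad ^ s ]
  | Series ((l, r), _sep, items) ->
      let inner =
        List.concat (List.map (render (col + 1)) (Array.to_list items))
      in
      [ pad ^ String.make 1 l ] @ inner @ [ pad ^ String.make 1 r ]
  | SpecialForm (kw, args, body) ->
      let child = render (col + indent_delta kw) in
      let kids = Array.to_list (Array.append args body) in
      (pad ^ kw) :: List.concat (List.map child kids)

(* Example input: an "if" form with one argument and two body atoms. *)
let example =
  SpecialForm ("if", [| Atom "x" |], [| Atom "y"; Atom "z" |])

let () = List.iter print_endline (render 0 example)
```

Because the indenter only consults the table and the generic sexp shape, a form added by a later EXTEND statement would need a new table entry but no change to render, which is the appeal of the looser AST for this purpose.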