From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105]) by walapai.inria.fr (8.13.6/8.13.6) with ESMTP id p9Q9Rxcr011747 for ; Wed, 26 Oct 2011 11:27:59 +0200 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ArEBAIXRp05KfVI0kGdsb2JhbABCqEBxCCIBAQEBCQkNBxQEIYFuAQEBBBICExkBGx0BAwwGBQQHOyIBEQEFARw7nwwKi1CCYIVGPYhwAgUKiGAElAqNMz2DcQ X-IronPort-AV: E=Sophos;i="4.69,408,1315173600"; d="scan'208";a="114692439" Received: from mail-ww0-f52.google.com ([74.125.82.52]) by mail4-smtp-sop.national.inria.fr with ESMTP/TLS/RC4-SHA; 26 Oct 2011 11:27:53 +0200 Received: by wwf25 with SMTP id 25so2502370wwf.9 for ; Wed, 26 Oct 2011 02:27:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=PK6chzrqfUKxU+yjFY/pi5w2XvoHRNhCGCriHt2zoOo=; b=lXZwlh8E045HEgYPgskZA1UPh9aW4zEKcNNP6QIjYSSliESzDqab0vVhJWQm6RsJ2C FmgaORj9rkhd1u+ijbqvfWCpdtoxCTfNmMqx4oCjks2blm75StV1Ao4CEndzHiieDx0F 6+4xK5vu4BKFjLq7V3+rLB41TQSZwfdZvdQOo= MIME-Version: 1.0 Received: by 10.216.208.169 with SMTP id q41mr670755weo.64.1319621273715; Wed, 26 Oct 2011 02:27:53 -0700 (PDT) Received: by 10.216.73.131 with HTTP; Wed, 26 Oct 2011 02:27:53 -0700 (PDT) In-Reply-To: <20111024210108.GE1838@siouxsie> References: <4EA54BD0.8090009@inria.fr> <1319460013.18639.90.camel@thinkpad> <20111024124617.GA2287@siouxsie> <1319461110.18639.96.camel@thinkpad> <20111024210108.GE1838@siouxsie> Date: Wed, 26 Oct 2011 11:27:53 +0200 Message-ID: From: Diego Olivier Fernandez Pons To: caml-list@inria.fr Cc: Gerd Stolpmann , Xavier Leroy , oliver Content-Type: multipart/alternative; boundary=0016e6dee8dc4205d704b030458f Subject: Re: [Caml-list] How to write an efficient interpreter --0016e6dee8dc4205d704b030458f Content-Type: text/plain; charset=ISO-8859-1 List, Thanks for your suggestions, here is a small summary of them with some comments. Options for interpreting a DSL ------------------------------------------- 1 - Classical optimised interpreter 2 - Combinator / partial application based interpreter 3 - Emitting bytecode for any VM 4 - Emitting Caml source and compile / run on the fly 5 - Retargeting Caml compiler (CamlP4 syntax extension + custom toplevel) Comments ---------------- 1. Is the most natural and first thing to try (working on optimising my naive interpreter right now) 5. Is the second most obvious idea : in the same way C is a portable assembler, core-ML is a portable functional language. If we could easily write a Lex/Yacc parser and generate Caml AST or some kind of quote and connect it to Caml, we could interpret many DST with little work. Moreover, generating Caml source is useful for debugging or profiling and can be compiled offline. One comment is that Lex/Yacc is a tool that any engineer can manage while CamlP4 is much more complex which increases maintainability risks. 4. Is the poor man's variant of 5. 2. Globally the idea is that evaluating Plus (Plus (Var "x", 3), 4) is slower than calling sum [| get env "x"; 3 ; 4 |] with a pre-compiled 'sum' function. There is already an interpreter for that same language that is targets C++ combinators : operator overloading and class hierarchy build an "AST" (2 + 3 * x -> expr) which instantiates a set of predefined operators (sum, forall, etc). The experience was however not good IMHO : despite a (naive) gc there are still permanent memory leaks and speed issues, part of the semantics is lost which requires some guessing, there are weird limitations like polymorphism that only works with 2 levels of nesting, finally the code basis grows uncontrollably. Maybe C++ is to blame there. In any case that's a second level of optimisation after 1. 3. Unless there is a 3rd part tool to emit the code, I won't get into this. It would allow native interaction with the customer chosen platform (Java, .NET, etc) but that's way too much trouble and risk as long as it is not a standard technique that any engineer can handle. Diego Olivier --0016e6dee8dc4205d704b030458f Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable
=A0 =A0 List,

Thanks for your suggestion= s, here is a small summary of them with some comments.

=
Options for interpreting a DSL
-----------------------------= --------------

1 - Classical optimised interpreter
2 - Combi= nator / partial application based interpreter
3 - Emitting byteco= de for any VM
4 - Emitting Caml source and compile / run on the f= ly
5 - Retargeting Caml compiler (CamlP4 syntax extension + custom toplev= el)

Comments
----------------
=
1. Is the most natural and first thing to try (working on op= timising my naive interpreter right now)

5. Is the second most obvious idea : in the same way C = is a portable assembler, core-ML is a portable functional language. If we c= ould easily write a Lex/Yacc parser and generate Caml AST or some kind of q= uote and connect it to Caml, we could interpret=A0many DST with little work= . Moreover, generating Caml source is useful for debugging or profiling and= can be compiled offline.

One comment is that Lex/Yacc is a tool that any enginee= r can manage while CamlP4 is much more complex which increases maintainabil= ity risks.

4. Is the poor man's variant of 5.<= /div>

2.=A0Globally the idea is that evaluating Plus (Plus (V= ar "x", 3), 4) is slower than calling sum [| get env "x"= ;; 3 ; 4 |] with a pre-compiled 'sum' function.

There is already an interpreter for that same language that is targets= C++ combinators : operator overloading and class hierarchy build an "= AST" (2 + 3 * x -> expr<int>) which instantiates a set of pre= defined operators (sum, forall, etc).=A0The experience was however not good= IMHO : despite a (naive) gc there are still permanent memory leaks and spe= ed issues, part of the semantics is lost which requires some guessing, ther= e are weird limitations like polymorphism that only works with 2 levels of = nesting, finally the code basis grows uncontrollably. Maybe C++ is to blame= there. In any case that's a second level of optimisation after 1.

3. Unless there is a 3rd part tool to emit the code, I = won't get into this. It would allow native interaction with the custome= r chosen platform (Java, .NET, etc) but that's way too much trouble and= risk as long as it is not a standard technique that any engineer can handl= e.

=A0 =A0 =A0 =A0 Diego Olivier
--0016e6dee8dc4205d704b030458f--