From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail1-relais-roc.national.inria.fr (mail1-relais-roc.national.inria.fr [192.134.164.82]) by walapai.inria.fr (8.13.6/8.13.6) with ESMTP id p07Kqbpt011747 for ; Fri, 7 Jan 2011 21:52:38 +0100 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApoAAPsLJ03RVdi0kGdsb2JhbACkIwgVAQEBAQkJDAcRBCCkIol4ghWEZC6FSgEBAwWFRwSLCYk/ X-IronPort-AV: E=Sophos;i="4.60,290,1291590000"; d="scan'208";a="94797791" Received: from mail-qy0-f180.google.com ([209.85.216.180]) by mail1-smtp-roc.national.inria.fr with ESMTP/TLS/RC4-MD5; 07 Jan 2011 21:52:34 +0100 Received: by qyk29 with SMTP id 29so17958681qyk.18 for ; Fri, 07 Jan 2011 12:52:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:mime-version:sender:received :in-reply-to:references:from:date:x-google-sender-auth:message-id :subject:to:content-type; bh=d3XCnYqL+A7zVt+licLLHcEtKQjeZMvZyd8+S9JGFdU=; b=r/8l9DMwJJJHb9WJczG6Ym/qtoxsWLbblFJS5fzuN8EYh84vvfYPDJSfksegtItdCM ml/+smsWA9o0me07H8IkO4Z9ANBCEuhNk304cUNPr8SqFXTNN08JWA4KdLnGG2n5Tpiq yyrusshE6yx6JoNVPjv8kxkY8/K1ZzTpJrHhI= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:content-type; b=LB8ADv0GWE49Lzrd7HnjsD5XyD0JjKTImegBL7tnn6DUwsXalJCEtWdxv7fvkojLww BoUlu8BvZJioMXhxYes+BGCTtk/uVV13hgbL0NisS3Mvpsvt+x9SQK6BnCGzOQ4PjvEW mZ0t0gJJS62azRbjALe8KQcHNA4OwgivPeb+Y= Received: by 10.229.233.18 with SMTP id jw18mr22418137qcb.122.1294433553784; Fri, 07 Jan 2011 12:52:33 -0800 (PST) MIME-Version: 1.0 Sender: gabriel.scherer@gmail.com Received: by 10.229.23.139 with HTTP; Fri, 7 Jan 2011 12:52:13 -0800 (PST) In-Reply-To: References: <699537.6718.qm@web111509.mail.gq1.yahoo.com> <87vd20plpv.fsf@mid.deneb.enyo.de> From: bluestorm Date: Fri, 7 Jan 2011 21:52:13 +0100 X-Google-Sender-Auth: L0J5bSCS2-6vBPzxbf8bE9dZa1o Message-ID: To: caml-list caml-list Content-Type: multipart/alternative; boundary=0016363b8fbc289e3a049947ccab Subject: [Caml-list] Purity and lazyness --0016363b8fbc289e3a049947ccab Content-Type: text/plain; charset=ISO-8859-1 On Fri, Jan 7, 2011 at 8:17 PM, Florian Weimer wrote: > Essential laziness occurs when you construct values which would not > otherwise be possible to construct in a pure language. For instance, > if you have got some sort of compiler, at one point, you might want to > perform name resolution. At that point, you want to translate names > (references) to the program entities they denote (referents). In > general, these references do not form a tree (for instance, a function > body can refer to another function by name, which refers to the > original function). With laziness, you can compute a map from names > to entities, and the entities use this maps to resolve the names they > use as references. Name lookup is only performed once per name. > Without laziness, you have to perform name lookup each time you follow > a reference because the self-referential nature of the data structure > cannot be reflected in the program. > I'm not convinced by your example. You need lazyness because you want to build a cyclic data structure. But is this such a good idea ? In my experience, implicit cyclicity often raises more problems than it solves, and it is much safer to use an explicit indirection layer. In your example, the resolved identifiers could contain not the declaration code itself, but a unique key that is associated to those declaration sites. This can be done easily, with only one traversal. Then in a second pass you can record the mapping of keys to resolved declaration sites, and therefore tie the knot. Below is a simple code example: type unique_loc = int type name = string (* a simple language with identifiers and mutually recursive definitions *) type 'a term = | Name of 'a | Lets of ((name * unique_loc) * 'a term) list * 'a term (* unique_loc are supposed to be unique, names are not due to shadowing. In a source program, a identifier points to a name, which is ambiguous. To resolve it means to turn all such names into the unique identifiers corresponding to the binding declaration *) type input_term = name term type resolved_term = unique_loc term let resolve : input_term -> resolved_term = let rec resolve (env : (name * unique_loc) list) = function | Name n -> Name (List.assoc n env) | Lets (decls, body) -> let env' = List.map fst decls @ env in let resolve_decl (binder, code) = binder, resolve env' code in Lets (List.map resolve_decl decls, resolve env' body) in resolve [] (* we can easily build a table that, to each unique location, associates the declaration code; during further analysis, this may allow some handling of identifiers based on the expression they denote. *) let rec resolution_table : resolved_term -> (unique_loc * resolved_term) list = function | Name _ -> [] | Lets (decls, body) -> let decl_table ((_, loc), code) = (loc, code) :: resolution_table code in resolution_table body @ List.concat (List.map decl_table decls) (* if you want to test *) let input : input_term = Lets ([ ("x", 1), Name "y"; ("y", 2), Name "x"; ("z", 3), Lets ([ ("a", 4), Name "x"; ("b", 5), Name "y"; ], Name "z"); ], Lets ( [("x", 6), Name "x"], Name "z")) --0016363b8fbc289e3a049947ccab Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable On Fri, Jan 7, 2011 at 8:17 PM, Florian Weimer=A0<fw@deneb.enyo.de><= /span>=A0wrote:
Essential laziness occurs when you construct values which would not
othe= rwise be possible to construct in a pure language. =A0For instance,
if y= ou have got some sort of compiler, at one point, you might want to
perfo= rm name resolution. =A0At that point, you want to translate names
(references) to the program entities they denote (referents). =A0In
gene= ral, these references do not form a tree (for instance, a function
body = can refer to another function by name, which refers to the
original func= tion). =A0With laziness, you can compute a map from names
to entities, and the entities use this maps to resolve the names they
us= e as references. =A0Name lookup is only performed once per name.
Without= laziness, you have to perform name lookup each time you follow
a refere= nce because the self-referential nature of the data structure
cannot be reflected in the program.

I'm not convinced by your example. You need lazyness because you wan= t to build a cyclic data structure. But is this such a good idea ? In my ex= perience, implicit cyclicity often raises more problems than it solves, and= it is much safer to use an explicit indirection layer.

In your example, the resolved identifiers could contain= not the declaration code itself, but a unique key that is associated to th= ose declaration sites. This can be done easily, with only one traversal. Th= en in a second pass you can record the mapping of keys to resolved declarat= ion sites, and therefore tie the knot.

Below is a simple code example:

=A0=A0type unique_loc =3D int
=A0=A0type name =3D string
=A0=A0
=A0=A0(* a simple language with identifiers and m= utually recursive definitions *)
=A0=A0type 'a term =3D
=A0=A0 =A0| Name of 'a
<= div>=A0=A0 =A0| Lets of ((name * unique_loc) * 'a term) list * 'a t= erm
=A0=A0
=A0=A0(* unique_loc are supposed to be uniqu= e, names are not due to
=A0=A0 =A0 shadowing. In a source program, a identifier points to a na= me,
=A0=A0 =A0 which is ambiguous. To resolve it means to turn al= l such names into
=A0=A0 =A0 the unique identifiers corresponding= to the binding declaration
=A0=A0*)
=A0=A0type input_term =3D name term
=A0= =A0type resolved_term =3D unique_loc term
=A0=A0
=A0=A0= let resolve : input_term -> resolved_term =3D
=A0=A0 =A0let re= c resolve (env : (name * unique_loc) list) =3D function
=A0=A0 =A0 =A0| Name n -> Name (List.assoc n env)
=A0=A0 = =A0 =A0| Lets (decls, body) ->
=A0=A0 =A0 =A0 =A0let env' = =3D List.map fst decls @ env in
=A0=A0 =A0 =A0 =A0let resolve_dec= l (binder, code) =3D
=A0=A0 =A0 =A0 =A0 =A0binder, resolve env= 9; code in
=A0=A0 =A0 =A0 =A0Lets (List.map resolve_decl decls, resolve env' = body)
=A0=A0 =A0in resolve []
=A0=A0
=A0=A0(*= we can easily build a table that, to each unique location,
=A0= =A0 =A0 associates the declaration code; during further analysis, this may<= /div>
=A0=A0 =A0 allow some handling of identifiers based on the expression = they
=A0=A0 =A0 denote. *)
=A0=A0let rec resolution_tab= le : resolved_term -> (unique_loc * resolved_term) list =3D
= =A0=A0 =A0function
=A0=A0 =A0 =A0| Name _ -> []
=A0=A0 =A0 =A0| Lets (decls,= body) ->
=A0=A0 =A0 =A0 =A0let decl_table ((_, loc), code) = =3D
=A0=A0 =A0 =A0 =A0 =A0(loc, code) :: resolution_table code in=
=A0=A0 =A0 =A0 =A0resolution_table body @ List.concat (List.map = decl_table decls)
=A0=A0
=A0=A0(* if you want to test *)
=A0=A0let i= nput : input_term =3D
=A0=A0 =A0Lets ([
=A0=A0 =A0 =A0(= "x", 1), Name "y";
=A0=A0 =A0 =A0("y&quo= t;, 2), Name "x";
=A0=A0 =A0 =A0("z", 3),
=A0=A0 =A0 =A0 =A0Lets ([<= /div>
=A0=A0 =A0 =A0 =A0 =A0("a", 4), Name "x";
=A0=A0 =A0 =A0 =A0 =A0("b", 5), Name "y";
<= div>=A0=A0 =A0 =A0 =A0], Name "z");
=A0=A0 =A0],
=A0=A0 =A0 =A0Lets (
=A0=A0 =A0 =A0 = =A0[("x", 6), Name "x"],
=A0=A0 =A0 =A0 =A0Na= me "z"))


--0016363b8fbc289e3a049947ccab--