From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id JAA31450 for caml-red; Sat, 7 Oct 2000 09:37:28 +0200 (MET DST)
Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id KAA16759 for <caml-list@pauillac.inria.fr>; Fri, 6 Oct 2000 10:09:02 +0200 (MET DST)
Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35])
	by concorde.inria.fr (8.10.0/8.10.0) with ESMTP id e9687Vb01857;
	Fri, 6 Oct 2000 10:07:32 +0200 (MET DST)
Received: (from xleroy@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id KAA17450; Fri, 6 Oct 2000 10:07:31 +0200 (MET DST)
Message-ID: <20001006100731.12927@pauillac.inria.fr>
Date: Fri, 6 Oct 2000 10:07:31 +0200
From: Xavier Leroy <Xavier.Leroy@inria.fr>
To: David McClain <dmcclain@azstarnet.com>, eijiro_sumii@anet.ne.jp,
        caml-list@inria.fr, monnier+lists/caml/news/@RUM.CS.YALE.EDU
Subject: Re: WWW Page of Team PLClub (Re: ICFP programming contest: results)
References: <001f01c02eb7$b41c99c0$210148bf@dylan>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 0.89.1
In-Reply-To: <001f01c02eb7$b41c99c0$210148bf@dylan>; from David McClain on Thu, Oct 05, 2000 at 03:33:26AM -0700
Sender: weis@pauillac.inria.fr

David McClain asks:

> Could you provide a reference to de Bruijin indexing? I recall reading about
> it some time ago, and after having searched my books here, I cannot find any
> references to it.

The idea behind de Bruijn indices is to represent variable references
not by name, but by the number of binders to cross to get at the
binder for the variable in question.  An example shoud make this
clearer:

        fun x -> fun y -> x + y

is represented in de Bruijn notation as

        fun -> fun -> var1 + var0

"var1" means "go up one binder", so this refers to the variable bound
by the first "fun".
"var0" means "go to the closest binder", so this refers to the
variable bound by the second "fun".

Notice that a de Bruijn index is relative to the position of the
variable reference.  Hence the same variable can be referenced by
different de Bruijn indices depending on context, e.g.

        fun x -> print_int x; fun y -> x + y

becomes
        fun -> print_int var0; fun -> var1 + var0
                         ^^^^         ^^^^
                       this is x     this is x too!

De Bruijn indices are interesting for several reasons.  One is
that they totally eliminate any alpha-conversion problems: expressions
have unique representations, without any need to work modulo renaming of
bound variables; moreover, substitution of variables by non-closed
expressions works very well, without any risk of name clashes as in
the normal, name-based notation.

Another reason is that if you represent the evaluation environment as
a stack, with each new binding simply pushing the bound value on top,
then the de Bruijn index for a variable is simply the position of the
variable's value in the stack, relative to the top of the stack.
(Substitute "list" for "stack" to deal with persistent, heap-allocated
environments, e.g. for closures.)  This leads to very simple
translations into abstract machine code.

Stefan Monnier asks:

> On a related note, how would that compare (performancewise) to an
> approach like "abstract syntax" (represent a function not
> as (<id>, <exp>) and neither as <exp> (as in the case of deBruijn)
> but as fn x => <exp>) ?

My feeling is that higher-order abstract syntax isn't applicable to
the GML language, because the language has variable assignment, which
I don't see how to fit in the h.o.a.s. representation.

- Xavier Leroy