From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id QAA26716 for caml-red; Sat, 30 Sep 2000 16:26:21 +0200 (MET DST)
Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id QAA17285 for <caml-list@pauillac.inria.fr>; Sat, 30 Sep 2000 16:25:56 +0200 (MET DST)
Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35])
	by nez-perce.inria.fr (8.10.0/8.10.0) with ESMTP id e8UEPWX04722;
	Sat, 30 Sep 2000 16:25:32 +0200 (MET DST)
Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id QAA17425; Sat, 30 Sep 2000 16:25:32 +0200 (MET DST)
From: Pierre Weis <Pierre.Weis@inria.fr>
Message-Id: <200009301425.QAA17425@pauillac.inria.fr>
Subject: Re: bottom types and threaded exits
In-Reply-To: <wxzokqtf8r.fsf@suburbia.net> from Julian Assange at "Sep 30, 100 08:12:52 pm"
To: proff@iq.org (Julian Assange)
Date: Sat, 30 Sep 2000 16:25:32 +0200 (MET DST)
Cc: caml-list@inria.fr
X-Mailer: ELM [version 2.4ME+ PL28 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: weis@pauillac.inria.fr

> Xavier Leroy <Xavier.Leroy@inria.fr> writes:
> 
> > That's one way of putting it, but as Pierre Weis mentioned, fully
> > polymorphic types such as 'a appear naturally during type inference
> > for non-terminating expressions.
> 
> Yes, I found that quite cool.

Ok, but note that there is nothing special in the compiler to do so:
it is a mere consequence of the Damas-Milner typing discipline.

> > Semantically, it is correct to assign type 'a to any expression that
> > never returns normally: there are no values of type 'a, but such
> > expressions never result in a value.
> 
> It depends on what level of semantics you are looking at. As far as
> type proofing is concerned, 'a is correct. However at a higher level
> you can say `the function never returns so saying that it returns any
> type is incorrect'. The problem is that some functions that do return,
> return 'a. Ocaml's overloading of 'a, is much the same as lisps
> overloading of nil and C's overloading of 0 or void*. You might
> counter saying that 'a is defined in terms of type-unification, thus
> reading additional meaning into the type is bogus. But such a view
> denies the semantics of what is being typed. A computer language --
> one of the important semantics of which is that is capable of capable
> of generating functions that fail to return.

I think there is some misunderstanding here. The meaning of type
variables (those 'a) can be semantically quite different depending on
their binding status. This is not random stuff: it is based on
theoretical grounds, such as:

Theorem: There is no Caml value with type 'a (meaning for all 'a. 'a).

Theorem: If the target type of a function is a universally quantified
type variable that does not appear in the source type of the function,
then this function never returns: it fails to terminate or raises an
exception.

The converse property is false, since a function may loop without having
such a polymorphic type.

Examples: all of the following functions loop for ever; for some of
them this can be proved from the mere examination of their type, for
others it cannot.

# let rec f x = f x;;
val f : 'a -> 'b = <fun>

This 'b is free in the source type of f: f loops or fail.

# let rec g x = g (x + 1);;
val g : int -> 'a = <fun>

This 'a is free in the source type of g: g loops or fails.

# let rec h x = h x + 1;;
val h : 'a -> int = <fun>

This 'a is bound in the source type of h, we know that h is
polymorphic but we cannot conclude something about its termination.

# let rec i x = i (i x);;
val i : 'a -> 'a = <fun>

This 'a in the target type is bound in the source type: we cannot
conclude something about the termination of i from its type (although,
the code of i clearly fails to terminate).

# let rec j x = j (x + 1) + 1;;
val j : int -> int = <fun>

j has a regular type, we cannot conclude something about its
termination.

> `the function never returns so saying that it returns any
> type is incorrect'.

You mean for instance that the type of the preceding j function
(int -> int) is wrong since j is looping and never returns anything ?

Sorry, nobody can implement such a typing discipline due to the
well-know halting problem.

> The problem is that some functions that do return,
> return 'a.

Oversimplified and confusing. You confuse 'a type variables that are
bound in the source type of the function with 'a type variables that
are not. You should say ``some functions that do return, return 'a
that is bound into their source type'', and ``no functions that do
return, return an 'a that is unbound into their source type'' (this is
also a bit oversimplified, we should introduce the notion of positive
and negative occurrence of type variables...).

> Ocaml's overloading of 'a, is much the same as lisps
> overloading of nil and C's overloading of 0 or void*.

You should see now that there is no notion of overloading of type
variables in Caml. Nor any relationship with any nil NULL or void*
polymorphic default value.

> You might counter saying that 'a is defined in terms of
> type-unification, thus reading additional meaning into the type is
> bogus.

On the contrary, reading additional meaning into the type is a good
idea, I tried to give some simple examples in this message. But you
can use types for many more semantical properties (exceptions, safety
of programs, ...). Thus ``reading additional meaning into the type'' is
really a good idea.

> But such a view denies the semantics of what is being typed. A
> computer language -- one of the important semantics of which is that
> is capable of capable of generating functions that fail to return.

Absolutely, we do want to be able to define functions that fail to
return. That's why implementing a view such as: ``the type must tell
if the function may loop'' is hopeless, if you have general recursion in
the language. If you accept to abandon general recursion, you may
gain some much stronger typing disciplines (e.g. dependent types in
Coq) that gives you stronger properties of ``what is being types''.

> > Thread.exit should also have type unit -> 'a; the reason it currently
> > has type unit -> unit is due to the way it is implemented
> > (as Thread.kill(Thread.self())), but this is more of an historical
> > accident.
> 
> I've used:
> 
>    let exit = Thread.exit (); assert false
> 
> to get the required type. While clearly a hack, it's better to use a
> simple construct like this instead of having a casting to 'a
> mechanism. Although I find full-bown determinacy types, mercury style
> (but god forbid mercury heavy handed type syntax) quite appealing.

What do you want here:
- an exit function (then you must write let exit () = ...) ? 
- or a value of type 'a (that you cannot have in Caml, as is well-known) ?

In the first case, why not using a recursive function, such as:

let rec exit () = Thread.exit (); exit ();;
val exit : unit -> 'a = <fun>

In the second case, you will not have it. You will write an expression
that fails to return the desired value!

> > Finally, concerning "while true do e done", there is only one typing
> > rule in the type-checker for "while p do e done", which assumes that
> > "p" may become false and the loop may thus terminate with the result
> > "()", hence the "unit" type.
> 
> This is fair. I think an assert false at the end of the loop, or
> 
> let bottom f = f (); assert false
> bottom (fun () -> while true do () done)
> 
> is effective.

Once more I would prefer a more general recursive function, that would
loop for any (even non looping) function. For instance:

# let rec bottom f = f (); bottom f;;     
val bottom : (unit -> 'a) -> 'b = <fun>

and I consider the 'b in the type of bottom as clearly indicating the
looping for ever nature of any application of the bottom function.

> Although, once again, it'd be nice to see a 'bottom rather
> than 'a type come out of this, even if the two are semantically equivalent
> in the eye of the type checker.

As you may know the type parameters in type schemes are bound at the
beginning of the type scheme (prenex quantification). Hence the names
of type parameters are irrelevant. Hence 'bottom -> 'bottom and 'a ->
'a are equivalent. Anyway, we can imagine to replace negative
occurrences of type parameters that do not appear in positive
position by names that you would prefer : 'bottom, 'bottom1, 'bottom2,
etc. I don't know if the extra implementation burden is worth the
little (subliminal) additional information.

Pierre Weis

INRIA, Projet Cristal, Pierre.Weis@inria.fr, http://cristal.inria.fr/~weis/