From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <garrigue@math.nagoya-u.ac.jp>
X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr
X-Spam-Level: 
X-Spam-Status: No, score=0.4 required=5.0 tests=AWL,DNS_FROM_RFC_ABUSE 
	autolearn=disabled version=3.1.3
Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83])
	by yquem.inria.fr (Postfix) with ESMTP id 90DB3BBAF
	for <caml-list@yquem.inria.fr>; Fri, 11 Sep 2009 10:18:11 +0200 (CEST)
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: ApoEAB+kqUqCNhAB/2dsb2JhbADdG4QYBYFUZA
X-IronPort-AV: E=Sophos;i="4.44,369,1249250400"; 
   d="scan'208";a="32575773"
Received: from kurims.kurims.kyoto-u.ac.jp ([130.54.16.1])
  by mail2-smtp-roc.national.inria.fr with ESMTP/TLS/DHE-RSA-AES256-SHA; 11 Sep 2009 10:18:10 +0200
Received: from localhost (orion [130.54.16.5])
	by kurims.kurims.kyoto-u.ac.jp (8.13.8/8.13.8) with ESMTP id n8B8I1eG024762;
	Fri, 11 Sep 2009 17:18:02 +0900 (JST)
Date: Fri, 11 Sep 2009 17:18:00 +0900 (JST)
Message-Id: <20090911.171800.183022743.garrigue@math.nagoya-u.ac.jp>
To: guillaume.yziquel@citycable.ch
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] polymorphic method.
From: Jacques Garrigue <garrigue@math.nagoya-u.ac.jp>
In-Reply-To: <4AA95027.5050308@citycable.ch>
References: <4AA8F986.1060206@citycable.ch>
	<20090910.232102.112610940.garrigue@math.nagoya-u.ac.jp>
	<4AA95027.5050308@citycable.ch>
X-Mailer: Mew version 6.2.51 on Emacs 22.3 / Mule 5.0 (SAKAKI)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable
X-Spam: no; 0.00; guillaume:01 guillaume:01 'self:01 'self:01 val:01 polymorphism:01 inference:01 compiler:01 inference:01 recursive:01 ocaml:01 ambiguities:01 syntax:01 syntax:01 camlp:01 

From: Guillaume Yziquel <guillaume.yziquel@citycable.ch>

> To continue on the example of nil: the current definition of nil
> (i.e. the one with type <hd : 'a.'a ; tl : 'b.'b>) would be written a=
s
> =

>> class nil : object
>>   polymorphic method hd =3D raise Empty
>>   polymorphic method tl =3D raise Empty
>> end
> =

> and the definition
> =

>> class nil : object
>>   method hd =3D raise Empty
>>   method tl =3D raise Empty
>> end
> =

> would not compile since the 'a and the 'b would not be bound in the
> definition of class nil.
> =

> Just to be sure we are talking about the same thing.

This is indeed what you are proposing. But from my point of view the
distinction is not really relevant: one can easily build examples
where methods should be polymorphic, but not maximally.
A typical example is fold:

class nil =3D object
  polymorphic method fold f x =3D x
end
class nil : object
  method fold : 'a 'b. 'a -> 'b -> 'b
end

class ['a] cons (h:'a) (t : 'self) =3D object (self : 'self)
  polymorphic method fold f x =3D f h (t#fold f x)
end
class  ['a] cons : 'a -> 'b -> object ('b)
  method fold : 'c. ('a -> 'c -> 'c) -> 'c -> 'c
end

As you can see, the right type is going to be inferred for cons (and a
polymorphic one!), but again the type for nil is wrong.
In order to obtain the right one, you must be fully explicit:
class ['a] nil =3D object
  polymorphic method fold (f : 'a -> 'b -> 'b) (x : 'b)  =3D x
end

>> Interestingly, currently you can define nil as an immediate object
>> and have immediately the right type without any annotation:
>> exception Empty
>> let nil =3D object
>>   method hd =3D raise Empty
>>   method tl =3D raise Empty
>> end ;;
>> val nil : < hd : 'a; tl : 'b > =3D <obj>
> =

> So you're saying that what I proposed up there, saying that it would
> not compile properly, indeed works out fine in the current setting?
> =

> Isn't that inconsistent to allow 'a coming from exceptions to get
> bound, while not doing so for some other 'a? (Sorry if I'm being
> unclear and using approximative terminology).

This doesn't work because of raise, but because I used an immediate
object rather than a class. Immediate objects may have implicit
polymorphism at the object level (not at the method level), while
everything must be explicit in classes (because classes define types al=
so).

> Humm... I still do not see why there would be ambiguity in choosing
> 'the most general' polymorphic type... Can't 'the most polymorphic
> one' be properly defined?

The most polymorphic one is properly defined, as a function of the
current type system (which may always be improved). But the most
polymorphic one is not necessarily the one you want, as my example
with nil shows. And once you have a polymorphic type, there exists an
infinity of other valid polymorphic types for the same method
(obtained by instantiating repeatedly type variables with ['a option]
for instance).

>> So the meaning of your "polymorphic" keyword would be: "give me the
>> most polymorphic type for this method, I hope I understand what it
>> will be, but if I'm wrong and it breaks my program I won't
>> complain". This may be practically ok,
> =

> Yes. For me, it would be OK. Even more if 'the most polymorphic one'
> is properly defined. I wouldn't worry too much about the 'I hope I
> understand what it will be' in this case.

Since type inference is properly defined, it is properly defined.
Unfortunately the rules are not available on paper, while this could
be done. It would still be a rather big definition, and the property
is, with lots of interactions, which is why I added the 'I hope I
understand'. Yet, I'm not too afraid of this part.

>> but this is a hard sell as a language feature. Particularly when you=

>> think that future versions of the compiler may be able to infer more=

>> polymorphic types, thanks to improvements in type inference, and
>> suddenly break your program.
> =

> I'd like to suggest something na=EFve: for each 'a type, try to bind =
it
> to the class definition (what you call monomorphic I think), and if
> you can't, then bind it to the method definition (what you call
> polymorphic). (This, of course, in the presence of the 'polymorphic'
> keyword).

I'm not sure what you mean by that. If the class definition has type
parameters, you could always choose to bind the remaining free type
variables to those type parameters, and be done with it. Of course
this is completely arbitrary, an probably not what you expect.

But this is not really a problem, because the approach I gave in my
previous mail could do the job correctly. In more detail:
* First infer monomorphic types for all methods (with eventually some
  type variables remaining), just like you would do for mutually
  recursive let-definitions. Only methods with explicit polymorphic
  types can be used polymorphically at this point.
* Mark all type variables that appear either in class parameters,
  class arguments, value fields or global references, as
  non-generalizable.
* In methods marked as polymorphic, generalize the remaining type
  variables to obtain the resulting type of the method.
* In public methods, remaining type variables that are not bound
  neither at the class or method level are an error for class
  definitions (but not for immediate objects).

This is a correct behaviour, and I believe it should do exactly what
you expect.  Thanks to the introduction of a new keyword for
polymorphic inference, it is also compatible with the current
behaviour for immediate objects.  But, as mentioned before, what it
provides is the most polymorphic type, not the principal type (a type
that would subsume all the possible ones), as there is no subsumption
on polymorphic method types.

So, should I go forward I implement it?
This would be nice to give it a try, but:
* type inference for classes is rather hairy, so I'm afraid it will
  not be easy to implement
* it is correct but not principal, and we don't like it very much in
  ocaml
* while it avoids having to write the method types by hand, you still
  need to know which methods will be polymorphic, so I'm not sure it
  will help beginners
* adding yet another keyword is not that great...

So if find time with nothing else to do I might try with a prototype,
but don't hold your breath.

At times I even think that all class definitions should be accompanied
by a class interface, with explicit types for all public methods.
This would certainly avoid all these ambiguities with variable binding
in types.

> I'm not really familiar with syntax extensions. Can this 'polymorphic=
'
> keyword be implemented with a syntax extension?

The keyword, yes; the behaviour, no.
And this goes much farther than the fact you have no access to types
in camlp4. Implementing it requires lots of changes to the type
inference itself.

Jacques Garrigue