From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id UAA18719 for caml-redistribution; Fri, 29 Jan 1999 20:02:00 +0100 (MET)
Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id TAA31119 for <caml-list@pauillac.inria.fr>; Fri, 29 Jan 1999 19:32:07 +0100 (MET)
Received: from mail3.microsoft.com (mail3.microsoft.com [131.107.3.123])
	by nez-perce.inria.fr (8.8.7/8.8.7) with ESMTP id TAA12719
	for <caml-list@inria.fr>; Fri, 29 Jan 1999 19:32:05 +0100 (MET)
Received: by mail3.microsoft.com with Internet Mail Service (5.5.2524.0)
	id <C87YRPN0>; Fri, 29 Jan 1999 10:32:05 -0800
Message-ID: <39ADCF833E74D111A2D700805F1951EF0F00B94C@RED-MSG-06>
From: Don Syme <dsyme@microsoft.com>
To: "'Markus Mottl'" <mottl@miss.wu-wien.ac.at>
Cc: caml-list@inria.fr
Subject: Classes: when and where? (long)  (was: looking for a nail)
Date: Fri, 29 Jan 1999 10:32:01 -0800
X-Mailer: Internet Mail Service (5.5.2524.0)
Sender: weis


Markus writes:

>  let print x = x # print
>  List.iter print [3; "hello"; 3.14; ["yeah, another list!"]]

I've used a few object systems, and I agree there are certainly abstractions
you can express with classes that can't be expressed in basic OCaml.  My
question is more this: what do people reckon is a sensible way of deciding
when to use class based abstraction?  Having witnessed several OO projects
sink into a morass of unnecessary inheritance hierarchies (and this with
very intelligent people on the design teams), it's hard to believe the
devices should be used indiscriminately.  

I think it would be useful if ultimately the OCaml manual included some
discussion on this topic, as experience is gained.  In all I've seen very
few balanced assessments of the value of using classes as an abstraction
device. O'Caml seems like an excellent place for making such an assessment
because it's possible to compare solutions side by side, without resorting
to the old fashioned rhetoric about the essential beauty of the lambda
calculus or the purity of OO.  ;-)

So when is it worth using OO features?  For waht it's worth, it seems to me
you must at least have 
	(a) more than one component/representation/type supporting the same
interface
      (b) these components/representations/types also support other
interfaces 
      (c) each component corresponds to just one type (hence I've written
component/representation/type above)
	(d) a non-trivial body of code that you judge to be worth making
abstract over the interface 
      (e) some indication that it won't be more appropriate to use other
abstraction mechanisms, e.g. modules
  and (f) be pretty sure that what you're trying to express will actually be
expressible in the object system you're using

I mention (a) because it's worth remembering that many programs will only
actually work with one implementation of a particular interface, in which
case you've added some abstraction clutter for little gain.  It may seem
stupid, but I've seen many smart people over-abstract in this way - all they
end up doing is creating unnecessary work for themselves.

I mention (b) because many components only support one interface, and in
this case there are alternative abstraction mechanisms.  For example,
Markus' example above could be achieved by modelling an interface as a
record of function closures and some state shared between these closures.
This is how formatters are defined in the Format module in the OCaml
library.  This technique is not as general as classes, since if you have a
single component supporting multiple interfaces then things get very
cumbersome, because you have to do all the "plumbing" by hand.  However in
some ways it is very conceptually simple and localized, and there is no need
to introduce new concepts in a program design.  

I mention (c) because classes necessarily implement only one type, whereas
modules can implement more.  However, I would agree it is relatively rare to
have module parameters (components) that implement more than one type, so
it's probably not a big deal. 

I mention (d) because there's the slightly tangential question of whether to
abstract at all.  I think it's fair to say that sometimes it isn't
particularly worthwhile to abstract common code that performs simple
"plumbing" operations connecting different interfaces.  People often forget
this, and go to extremes to ensure their code is "maximally abstract".  Some
people like it as a kind of programming discipline, but personally I reckon
that some "abstract" code is so short, and gets instantiated only a few
times, that it's easier to write it two or three times rather than go to the
bother of abstracting it.  We all fall into the trap of over-abstraction
sometime or another, since we tend to enjoy dreaming up abstractions more
than getting our programming tasks done.  Often we never see a pay back for
the investment we make in our abstractions, in terms of overall
cost-reduction.  But I think it's fair to say that OO tends to encourage a
tendency toward over-abstraction (just look at some of the ridiculous class
hierarchies floating around), in addition to supporting some valid
abstractions.

I also mention (d) because there's the (again slightly tangential) question
of _when_ to do abstraction.  I think it's fair to say abstraction is
usually best applied midway or late in the design process, or when we're
trying to create a library for reuse.  I agree with you that some programs
can be simplified at a late date by judicious use of subtyping and other
abstraction techniques.  However, many OO folks I've met tend to try to
design their subtype hierarchies early in a project, sometimes leading to
disaster.  This is because they wasted a lot of time discussing
representational decisions that were of very dubious long term benefit.  In
real projects, the early part of the design process is where wasted time can
hurt the most: after all, you probably should be spending the time coming up
with a prototype to show the customers (just to check that you're building
the right thing), rather than designing nice abstraction hierarchies that
may or may not give better reuse.

Often, of course, it is definitely worth writing abstract versions of
algorithms or other code.  I mention (e) because modules are sometimes the
best solution to such problems.  They're not perfect, e.g. the types
produced from different applications of a paramaterized module are not
compatible, when similar OO types might be, and this can tend to lead to a
proliferation of pretty mindless modules.  However, modules do at least
admit statically type-checked binary methods and so on in a pretty
straight-forward fashion, and their semantics are pretty much good ol'
substitution.  They also permit simpler code optimization than dispatch
mechanisms.  This is why modules are a good choice for simple purposes such
as the collection types in the OCaml library.

Finally, even when we decide a design abstraction doesn't fit other
mechanisms, it's not always easy to make it fit an object model either.  For
example, many object abstractions ultimately rely on casting, which is only
available via Obj.magic in Ocaml.  Similarly it's easy to get in trouble
with contravariance/covariance issues, as recent questions have shown.  At a
design level, it's not easy to predict in advance the methods that we might
want to support for different representations, and nor does it seem "right"
to equip types with every possible behaviour at the time of their creation
(e.g. you clearly don't want to equip every type with every posssible
visualisation apparatus).  That is, it's often difficult to come up with a
classification hierarchy (and/or naming conventions for methods) that will
allow any sensible reuse.  It's not impossible, of course, it's just not
that easy.  

Personally, I think the apparent preference of OCaml users to stick with the
basic language may be telling - or have I misjudged this?   Comments
welcome!

> There is probably hardly any 
> program, where
> you wouldn't need such constructs. 

In summary, I think my point is that there are many programs (but not all -
I don't want to be dogmatic about this) where the application of class based
abstraction techniques is inappropriate.  For example, IMHO no compiler I've
seen would benefit, nor any symbolic computation system, nor any module in
the basic OCaml library apart from, perhaps, the Format library.  

> Imagine that all basic types were classes and all of them 
> would support
> a method e.g. "print".

All very well, but this is, after all, imaginary.  The runtime cost of
supporting a dictionary of methods for basic types is just too high (though
I guess Haskell manages it via type classes).

All of this is, of course, just my opinion!
Don Syme

P.S. Hey let's ensure this discussion continues on a low-key basis rather
than descending into a flame fest??  Thanks!! :-) 


------------------------------------------------------------------------
At the lab:                                     At home:
Microsoft Research Cambridge                    11 John St
St George House                                 CB1 1DT
Cambridge, CB2 3NH, UK
Ph: +44 (0) 1223 744797                         Ph: +44 (0) 1223 722244
http://research.microsoft.com/users/dsyme
email: dsyme@microsoft.com
   "You've been chosen as an extra in the movie
        adaptation of the sequel to your life"  -- Pavement, Shady Lane
------------------------------------------------------------------------