From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id UAA18719 for caml-redistribution; Fri, 29 Jan 1999 20:02:00 +0100 (MET) Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id TAA31119 for ; Fri, 29 Jan 1999 19:32:07 +0100 (MET) Received: from mail3.microsoft.com (mail3.microsoft.com [131.107.3.123]) by nez-perce.inria.fr (8.8.7/8.8.7) with ESMTP id TAA12719 for ; Fri, 29 Jan 1999 19:32:05 +0100 (MET) Received: by mail3.microsoft.com with Internet Mail Service (5.5.2524.0) id ; Fri, 29 Jan 1999 10:32:05 -0800 Message-ID: <39ADCF833E74D111A2D700805F1951EF0F00B94C@RED-MSG-06> From: Don Syme To: "'Markus Mottl'" Cc: caml-list@inria.fr Subject: Classes: when and where? (long) (was: looking for a nail) Date: Fri, 29 Jan 1999 10:32:01 -0800 X-Mailer: Internet Mail Service (5.5.2524.0) Sender: weis Markus writes: > let print x = x # print > List.iter print [3; "hello"; 3.14; ["yeah, another list!"]] I've used a few object systems, and I agree there are certainly abstractions you can express with classes that can't be expressed in basic OCaml. My question is more this: what do people reckon is a sensible way of deciding when to use class based abstraction? Having witnessed several OO projects sink into a morass of unnecessary inheritance hierarchies (and this with very intelligent people on the design teams), it's hard to believe the devices should be used indiscriminately. I think it would be useful if ultimately the OCaml manual included some discussion on this topic, as experience is gained. In all I've seen very few balanced assessments of the value of using classes as an abstraction device. O'Caml seems like an excellent place for making such an assessment because it's possible to compare solutions side by side, without resorting to the old fashioned rhetoric about the essential beauty of the lambda calculus or the purity of OO. ;-) So when is it worth using OO features? For waht it's worth, it seems to me you must at least have (a) more than one component/representation/type supporting the same interface (b) these components/representations/types also support other interfaces (c) each component corresponds to just one type (hence I've written component/representation/type above) (d) a non-trivial body of code that you judge to be worth making abstract over the interface (e) some indication that it won't be more appropriate to use other abstraction mechanisms, e.g. modules and (f) be pretty sure that what you're trying to express will actually be expressible in the object system you're using I mention (a) because it's worth remembering that many programs will only actually work with one implementation of a particular interface, in which case you've added some abstraction clutter for little gain. It may seem stupid, but I've seen many smart people over-abstract in this way - all they end up doing is creating unnecessary work for themselves. I mention (b) because many components only support one interface, and in this case there are alternative abstraction mechanisms. For example, Markus' example above could be achieved by modelling an interface as a record of function closures and some state shared between these closures. This is how formatters are defined in the Format module in the OCaml library. This technique is not as general as classes, since if you have a single component supporting multiple interfaces then things get very cumbersome, because you have to do all the "plumbing" by hand. However in some ways it is very conceptually simple and localized, and there is no need to introduce new concepts in a program design. I mention (c) because classes necessarily implement only one type, whereas modules can implement more. However, I would agree it is relatively rare to have module parameters (components) that implement more than one type, so it's probably not a big deal. I mention (d) because there's the slightly tangential question of whether to abstract at all. I think it's fair to say that sometimes it isn't particularly worthwhile to abstract common code that performs simple "plumbing" operations connecting different interfaces. People often forget this, and go to extremes to ensure their code is "maximally abstract". Some people like it as a kind of programming discipline, but personally I reckon that some "abstract" code is so short, and gets instantiated only a few times, that it's easier to write it two or three times rather than go to the bother of abstracting it. We all fall into the trap of over-abstraction sometime or another, since we tend to enjoy dreaming up abstractions more than getting our programming tasks done. Often we never see a pay back for the investment we make in our abstractions, in terms of overall cost-reduction. But I think it's fair to say that OO tends to encourage a tendency toward over-abstraction (just look at some of the ridiculous class hierarchies floating around), in addition to supporting some valid abstractions. I also mention (d) because there's the (again slightly tangential) question of _when_ to do abstraction. I think it's fair to say abstraction is usually best applied midway or late in the design process, or when we're trying to create a library for reuse. I agree with you that some programs can be simplified at a late date by judicious use of subtyping and other abstraction techniques. However, many OO folks I've met tend to try to design their subtype hierarchies early in a project, sometimes leading to disaster. This is because they wasted a lot of time discussing representational decisions that were of very dubious long term benefit. In real projects, the early part of the design process is where wasted time can hurt the most: after all, you probably should be spending the time coming up with a prototype to show the customers (just to check that you're building the right thing), rather than designing nice abstraction hierarchies that may or may not give better reuse. Often, of course, it is definitely worth writing abstract versions of algorithms or other code. I mention (e) because modules are sometimes the best solution to such problems. They're not perfect, e.g. the types produced from different applications of a paramaterized module are not compatible, when similar OO types might be, and this can tend to lead to a proliferation of pretty mindless modules. However, modules do at least admit statically type-checked binary methods and so on in a pretty straight-forward fashion, and their semantics are pretty much good ol' substitution. They also permit simpler code optimization than dispatch mechanisms. This is why modules are a good choice for simple purposes such as the collection types in the OCaml library. Finally, even when we decide a design abstraction doesn't fit other mechanisms, it's not always easy to make it fit an object model either. For example, many object abstractions ultimately rely on casting, which is only available via Obj.magic in Ocaml. Similarly it's easy to get in trouble with contravariance/covariance issues, as recent questions have shown. At a design level, it's not easy to predict in advance the methods that we might want to support for different representations, and nor does it seem "right" to equip types with every possible behaviour at the time of their creation (e.g. you clearly don't want to equip every type with every posssible visualisation apparatus). That is, it's often difficult to come up with a classification hierarchy (and/or naming conventions for methods) that will allow any sensible reuse. It's not impossible, of course, it's just not that easy. Personally, I think the apparent preference of OCaml users to stick with the basic language may be telling - or have I misjudged this? Comments welcome! > There is probably hardly any > program, where > you wouldn't need such constructs. In summary, I think my point is that there are many programs (but not all - I don't want to be dogmatic about this) where the application of class based abstraction techniques is inappropriate. For example, IMHO no compiler I've seen would benefit, nor any symbolic computation system, nor any module in the basic OCaml library apart from, perhaps, the Format library. > Imagine that all basic types were classes and all of them > would support > a method e.g. "print". All very well, but this is, after all, imaginary. The runtime cost of supporting a dictionary of methods for basic types is just too high (though I guess Haskell manages it via type classes). All of this is, of course, just my opinion! Don Syme P.S. Hey let's ensure this discussion continues on a low-key basis rather than descending into a flame fest?? Thanks!! :-) ------------------------------------------------------------------------ At the lab: At home: Microsoft Research Cambridge 11 John St St George House CB1 1DT Cambridge, CB2 3NH, UK Ph: +44 (0) 1223 744797 Ph: +44 (0) 1223 722244 http://research.microsoft.com/users/dsyme email: dsyme@microsoft.com "You've been chosen as an extra in the movie adaptation of the sequel to your life" -- Pavement, Shady Lane ------------------------------------------------------------------------