From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id AAA07617 for caml-red; Sun, 23 Jul 2000 00:03:21 +0200 (MET DST)
Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id PAA19959 for <caml-list@pauillac.inria.fr>; Sat, 22 Jul 2000 15:34:24 +0200 (MET DST)
Received: from mail.cs.uu.nl (sunset.cs.uu.nl [131.211.80.32])
	by nez-perce.inria.fr (8.10.0/8.10.0) with ESMTP id e6MDYNf11797
	for <caml-list@inria.fr>; Sat, 22 Jul 2000 15:34:23 +0200 (MET DST)
Received: from silvester.cs.uu.nl.cs.uu.nl (silvester.cs.uu.nl [131.211.80.119])
	by mail.cs.uu.nl (Postfix) with ESMTP
	id C0705451F; Sat, 22 Jul 2000 15:34:17 +0200 (MET DST)
From: Frank Atanassow <franka@cs.uu.nl>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <14713.41688.558933.239829@silvester.cs.uu.nl>
Date: Sat, 22 Jul 2000 15:34:16 +0200 (MET DST)
To: Markus Mottl <mottl@miss.wu-wien.ac.at>
Cc: Frank Atanassow <franka@cs.uu.nl>, John Gerard Malecki <johnm@artisan.com>,
        OCAML <caml-list@inria.fr>
Subject: Re: [newbie] Define and use records in sum types
In-Reply-To: <20000721220058.A29053@miss.wu-wien.ac.at>
References: <wd8d7kkewbr.fsf@parate.irisa.fr>
	<20000717120151.A32148@miss.wu-wien.ac.at>
	<14709.63462.792269.194367@ish.artisan.com>
	<20000719221048.B23676@miss.wu-wien.ac.at>
	<14712.16572.925353.202223@silvester.cs.uu.nl>
	<20000721220058.A29053@miss.wu-wien.ac.at>
X-Mailer: VM 6.75 under Emacs 20.4.1
Sender: weis@pauillac.inria.fr

[reusing record labels by putting them in different modules]

Markus Mottl writes:
 > On Fri, 21 Jul 2000, Frank Atanassow wrote:
 > > But I stopped because inevitably, after some development, the type t
 > > becomes recursive, and you cannot have recursive types spanning modules,
 > > so I would have to go back to the straightforward method.
 > 
 > Inevitably? Why?

Because non-recursive datatypes are trivial, and baby programs are trivial,
but they grow up into non-trivial adults.

 > At least in the cases where I found a problem with building modules, I
 > always discovered an easy (and generally much cleaner) way to restructure
 > the design to get around it. In fact, such problems point me to design
 > errors. (Yes, there are rare examples, where redesigning does not help, but
 > I never felt the need to implement one for real purposes).

Let's not get into discussions about refactoring or recursive modules.

The fact is that in your example you used only the namespace-handling aspect
of modules to achieve field label overloading. The restriction against modules
being recursive has nothing to do with their namespace-handling aspect, but
rather with implementation and type-theoretic issues. If you had opted to solve the
problem by just adding a prefix to the labels (or used my proposed
mechanism), and kept them in the same module, then you would not have had to
appeal to (largely unrelated) issues like refactoring, and would have happily
let there be recursive dependencies among the types in question, if the
situation warranted it.

What I'm saying is that, because the module system treats both the
namespace-handling aspect and the type-theoretic aspects in one language
construct, the restrictions on one aspect carry over to the other. Therefore,
if you use the module system solely for namespace-handling, it's liable to
bite you in the end.
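
A minimal sketch of the contrast, assuming ordinary (non-recursive) modules as
discussed in this thread. The employee/department types and every label name
are invented for illustration and are not taken from the earlier messages:

```ocaml
(* Technique A: one module per record type, used purely as a namespace,
   so that both records can have a field called [name]. *)
module Employee = struct
  type t = { name : string; salary : int }
end

module Department = struct
  (* Department may mention Employee.t because Employee is defined first ... *)
  type t = { name : string; staff : Employee.t list }
end
(* ... but Employee.t cannot refer back to Department.t: ordinary modules
   are checked in order and cannot depend on each other cyclically. *)

(* Technique B: prefixed labels in a single module. The types can be made
   mutually recursive with [and] as soon as the design calls for it. *)
type employee = { emp_name : string; emp_dept : department option }
and department = { dept_name : string; dept_staff : employee list }

(* Values built with technique A need module-qualified labels. *)
let alice = { Employee.name = "alice"; Employee.salary = 1000 }
let sales = { Department.name = "sales"; Department.staff = [ alice ] }

(* Values built with technique B use the prefixed labels directly. *)
let research = { dept_name = "research"; dept_staff = [] }
let bob = { emp_name = "bob"; emp_dept = Some research }
```

With technique A, making an employee point back at its department would mean
merging the two modules again, which is the situation described above.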

Of course for something like salary records, you aren't likely to engender
any recursion no matter how long you develop the program; my objection is to
the general applicability of the technique.

 > > It's true that the uses often coincide, but I think it obscures the
 > > issues. In particular, I think it encourages programmers to conflate
 > > syntactic information-hiding (like not exporting some constructors or
 > > functions) with type-theoretic information-hiding (like abstracting a
 > > type with a functor). The first ought to be considered only as an
 > > engineering technique to reduce complexity. The second OTOH is a logical
 > > technique to increase flexibility of implementation.
 > 
 > But how about "capturing logical invariants"? You need both ways of
 > information hiding there: restricting the number of exported functions
 > *and* restricting their types! Otherwise there could be ways to destroy the
 > invariant.

Can you be more specific about "capturing logical invariants"? I don't know
quite what you mean.

An instance I can think of where access to names could have semantic
significance is if it has to do with generativity. I hate generativity with a
passion, but eliminating it from ML modules would require a wholesale revision
of the module system, which it is not my object to propose here. So, I will
concede that generativity may cause name access to acquire semantic
significance. (But I hope you agree that it is at least desirable that names
and namespaces should have no semantic significance whatsoever.)

BTW, I inadvertently omitted the most important reason for namespace-handling
constructs: to prevent conflicts and unwanted shadowing arising when you use
modules developed by different people.

 > > As it stands, though, I realize that the current module system is far too
 > > entrenched to change, so I would be happy if we could just qualify
 > > constructor and field labels by their type constructor's name, so that
 > > this would typecheck:
 > > 
 > >   type r1 = { name: int }
 > >   type r2 = { name: float }
 > >   type t1 = A of int
 > >   type t2 = A of float
 > >   let (r2,t2) : r2 * t2 = { name = 1.0 }, A 2.0
 > >   let (r1,t1) : r1 * t1 = { r1.name = 1 }, t1.A 2
 > 
 > I'd rather use different namings like "r1_name" and "r2_name" for the
 > field: although this requires me to always include the name of the type,
 > one cannot get confused either: it makes the sources easier to understand.

I don't find that argument very convincing; OCaml _already_ allows type, value
and field label definitions to shadow each other, which many people consider a
cognitive burden. I don't have an opinion on that issue; I'm just trying to be
consistent with the way OCaml treats bindings currently, and add a little bit
more flexibility.

In fact, I don't see how this makes programs any less readable.

If you don't add this mechanism, then the rule remains, "an identifier X
refers to the last definition of X in scope".

If you do add this mechanism, then that doesn't change (since the identifier
[path] "t.x" is still distinct from "x").

The only problem I see is that for the toplevel you will lose the ability to
throw away bindings that got shadowed by newer definitions, even if they never
appear inside some existing binding. But as far as I know, OCaml's top-level
doesn't garbage collect old bindings anyway, so it's a non-issue for now
(unless the implementors intend to change this detail in the future).
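
The scoping rule above can be seen in a short sketch. The type names r1 and r2
follow the quoted example; first, second, x and get_old_x are made up for
illustration. The qualified form appears only in a comment because it is the
proposed syntax, not something OCaml accepts:

```ocaml
type r1 = { name : int }
let first = { name = 1 }        (* [name] means r1's label here *)

type r2 = { name : float }      (* shadows the label of r1 *)
let second = { name = 1.0 }     (* the unqualified [name] now means r2's label *)

(* Values follow the same rule, and a shadowed binding stays alive as long
   as an existing definition still uses it. *)
let x = 1
let get_old_x () = x
let x = 2.0

(* Under the proposal, a path such as [r1.name] would name the shadowed
   label directly. That syntax is hypothetical and is not valid OCaml:
     let r = { r1.name = 1 }                                          *)
```

Here get_old_x still sees the first x, which is the case where an old binding
appears inside an existing one and so could never be discarded anyway.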

-- 
Frank Atanassow, Dept. of Computer Science, Utrecht University
Padualaan 14, PO Box 80.089, 3508 TB Utrecht, Netherlands
Tel +31 (030) 253-1012, Fax +31 (030) 251-3791