From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id FAA32614; Sat, 1 May 2004 05:38:27 +0200 (MET DST) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id FAA32618 for ; Sat, 1 May 2004 05:38:26 +0200 (MET DST) Received: from smtp1.adl2.internode.on.net (smtp1.adl2.internode.on.net [203.16.214.181]) by nez-perce.inria.fr (8.12.10/8.12.10) with ESMTP id i412n4EV030605 for ; Sat, 1 May 2004 04:49:05 +0200 Received: from [192.168.1.200] (ppp116-155.lns1.syd2.internode.on.net [150.101.116.155]) by smtp1.adl2.internode.on.net (8.12.9/8.12.9) with ESMTP id i412n0Zq060322; Sat, 1 May 2004 12:19:00 +0930 (CST) Subject: Re: [Caml-list] [ANN] The Missing Library From: skaller Reply-To: skaller@users.sourceforge.net To: Brian Hurt Cc: Jon Harrop , caml-list In-Reply-To: References: Content-Type: text/plain Message-Id: <1083379738.2581.282.camel@pelican.wigram> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 (1.2.2-4) Date: 01 May 2004 12:48:59 +1000 Content-Transfer-Encoding: 7bit X-Miltered: at nez-perce with ID 40931020.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Loop: caml-list@inria.fr X-Spam: no; 0.00; caml-list:01 sourceforge:01 2004:99 extlib:01 enums:01 iterator:01 extlib:01 dereference:01 enums:01 iterator:01 origins:01 conceptually:01 pointers:01 distinctions:01 mutability:01 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk On Sat, 2004-05-01 at 01:58, Brian Hurt wrote: > Just for the record, you *can* do this just fine in Ocaml. Ext-lib is > already doing this. This isn't a limitation of the language, it's a > feature lack of the core library. This isn't quite correct. Extlib enums are not the same as iterators. However it is true that an important part of the iterator concept is captured in Extlib. basic C++ iterators have 3 functions: dereference, advance, and compare-equal with strict invalidation conditions. More advanced iterators support total order comparisons, and allow copies to be held (as well as possibly reverse movement and random access) Extlib enums use a quite different concept. In particular C++ iterators have very strict interpretation of validity and copyability which Extlib does not specify. Also enums do not support positional comparisons, or subranging using a pair of iterator. In addition C++ iterators support insertion, using an iterator to find a position in a container, and numerous other properties all of which are directly related to their origins conceptually as pointers into arrays. > I don't think having lots of different types of iterators is all that > usefull. Once you get beyond a simple linear walk through the data > structure, the nature of the data structure becomes important. They're vital. The kinding of iterators as mentioned above is fundamental. You may be right that, for example, bidirectional or random iterators are less useful, but the distinctions between forward, input, and output iterators are crucial, and the lack of that distinction in Extlib effectively breaks the library. I won't touch it until this is fixed. Indeed, there is work in the C++ committee to FURTHER classify iterators, particularly in respect of mutability of the container and buffering of the dereference operation. Basically: a forward iterator is a position in a container which lives in memory. Provided you don't modify the container, iterators remain valid. Iteration does not consume the container in any way. Input iterators can be copied but only one is valid after advancing (the one you advanced). They are used to operate on streams where advancing requires a destructive operation. In type theoretic terms, something like: forward iterators work on inductive data types, input iterators on coinductive data types. The difference is utterly fundamental. Extlib tries to present a common interface for input and forward enumerations. This is inconsistent. What needs changing may well ONLY be documentation, I'm not sure. Enums, like iterators, are intrinsically unsafe to use. The conditions under which the interface will yield stated results must be specified *precisely and pedantically*. And it is NOT easy to do so. Witness STL. To give an example: suppose you have two stream handles and apply a double Extlib.Enum.iter2 on them. By specification, that causes a force operation. It seems clear, right? It isn't. What happens if its the SAME enumeration? In C++ STL this is handled because there is a STRICT interpretation of an algorithm like: while(p!=e) cout << *p++ << *q++; Here ISO C rules make it clear that the next two elements of the input sequence get printed in an arbitrary order, then repeat. Extlib iter makes no guarrantees at all. It might well crash immediately because it forces one of the handles completely first, so the second one is already exhausted because its the same iteration : this is sure to be an issue if the enumeration is a generator function. Note in the C++ example it is trivial to *enforce* an ordering: while(p!=e) { cout << *p++; cout << *q++; } Now we know the output of a single generator will be strictly ordered. Note that this assumes we have a handle! If it isn't a handle kind of iterator, then p++ immediately invalidates q, and so q++ is undefined. So specifying the semantics of the above algorithm for an STL iterator is VERY HARD! It definitely requires a sophisticated classification scheme for iterators. There's no way around this: there is a compromise between a complex set of iterator abstractions, and simply accepting 'undefined behaviour' as a specification for an algorithm. Extlib can easily err on the side of 'undefined'. But it must be pedantic about what IS defined and what is not or there is no way to trust it to integrate memory containers and streams. Last I looked a symptom of the design fault is found in the 'fast_count' function. Whilst that exists, the library is necessarily flawed. Arguments about 'its useful' do not hold any water: it has to be justified in terms of a specified abstraction or thrown out. At least part of the problem here is that Ocaml is primarily a functional language, and functional languages can't handle stateful programming easily. Iterators are intrinsically stateful: thats the whole point of them. -- John Skaller, mailto:skaller@users.sf.net voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners