From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id PAA23570; Thu, 20 Mar 2003 15:45:42 +0100 (MET) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: (from xleroy@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id PAA24437 for caml-list@pauillac.inria.fr; Thu, 20 Mar 2003 15:45:41 +0100 (MET) Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id DAA07310 for ; Thu, 20 Mar 2003 03:33:52 +0100 (MET) Received: from bastion.artisan.com ([209.144.161.130]) by concorde.inria.fr (8.11.1/8.11.1) with ESMTP id h2K2Xpf15454 for ; Thu, 20 Mar 2003 03:33:51 +0100 (MET) Received: from ypmaster.artisan.com (ypmaster [172.16.2.1]) by bastion.artisan.com (8.9.2/8.9.2) with ESMTP id SAA11803 for ; Wed, 19 Mar 2003 18:33:50 -0800 (PST) Received: from barrow.artisan.com (barrow [172.16.10.17]) by ypmaster.artisan.com (8.9.2/8.9.2-arti) with ESMTP id SAA08326 for ; Wed, 19 Mar 2003 18:33:49 -0800 (PST) Received: (from johnm@localhost) by barrow.artisan.com (8.11.6/8.11.6) id h2K2Xnm21643; Wed, 19 Mar 2003 18:33:49 -0800 From: John Gerard Malecki MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <15993.10380.206589.498703@barrow.artisan.com> Date: Wed, 19 Mar 2003 18:33:48 -0800 To: Caml List Subject: [Caml-list] Wanted - General Purpose "Glue Logic" Data-Structures X-Mailer: VM 7.07 under Emacs 21.2.1 Reply-To: "John Gerard Malecki" X-Organization: Artisan Components, Inc. X-Spam: no; 0.00; glue:01 arsenal:99 real-world:01 pervasives:01 randomized:01 re-writing:01 re-write:01 book-keeping:01 ocaml:01 constructs:02 tree:02 gerard:02 objects:02 wrote:03 interface:03 X-Spam: no; 0.00; glue:01 arsenal:99 real-world:01 pervasives:01 randomized:01 re-writing:01 re-write:01 book-keeping:01 ocaml:01 constructs:02 tree:02 gerard:02 objects:02 wrote:03 interface:03 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk I want to broaden my arsenal of data-structures that interface data between the specific structures that some algorithms require. Here is an example of a real problem that I solved too specifically: I had an existing reader that produced a list of objects, 'a list. I wrote a filter which fractured those items producing 'a list list. I then wanted to feed that data to an existing program which builds a balanced tree top-down. It expects 'a list which it then passes to a median finder which prefers its input list to be in random order. In the great ocaml tradition this worked just fine and the first time. (What a great language.) To really test things I then ran it on MOADB, the mother of all data bases, a very large but real-world data-base. The program broke down in 2 places. First, List.concat was used to convert the output of the fracturer from 'a list list to 'a list. Not only is List.concat not tail-recursive but @ (Pervasives.append) is also not tail-recursive. I modified the fracturer to directly produce 'a list. Second, the median program has some limitations. It not only prefers the input to be randomized but while it runs it too constructs some 'a list list. (Why? This median program returns not just the median but the set of items lt and gt the median.) I re-wrote the median program. I would prefer not to keep re-writing programs and instead have a better collection intermediate data-structures that I can use to glue my programs together efficiently and safely. One can argue that the median program was already flawed and it was best to re-write it. Still, it would be nice to have a data-structure which the fracturer could produce that not only can deal with large data sets but also has the randomizing property. Of course I want all this at very little expense as none of these manipulations are germane to the real problem at hand. I considered having the fracturer build a priority queue over random numbers but all that balancing book-keeping seems expensive. I guess what I'm looking for are inexpensive data-structures that have properties like - fast computation of cardinality - fast addition of single elements - fast addition of lists of elements - fast removal of single elements in random order - not choking on a large data size Any suggestions? Does anyone have other useful glue-logic data-structures which they use? ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners