From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=AWL autolearn=disabled version=3.1.3 Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by yquem.inria.fr (Postfix) with ESMTP id 91A75BBC1 for ; Thu, 6 Mar 2008 10:53:09 +0100 (CET) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgAAAAdOz0fAXQImh2dsb2JhbACQeQEBAQgKKZpE X-IronPort-AV: E=Sophos;i="4.25,455,1199660400"; d="scan'208";a="9939983" Received: from discorde.inria.fr ([192.93.2.38]) by mail3-smtp-sop.national.inria.fr with ESMTP; 06 Mar 2008 10:53:09 +0100 Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by discorde.inria.fr (8.13.6/8.13.6) with ESMTP id m269qsAY030120 (version=TLSv1/SSLv3 cipher=RC4-SHA bits=128 verify=OK) for ; Thu, 6 Mar 2008 10:53:08 +0100 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgAAAFJNz0fBL1AZi2dsb2JhbACQeQEBAQgEBgcIGppA X-IronPort-AV: E=Sophos;i="4.25,455,1199660400"; d="scan'208";a="8065389" Received: from gw.exalead.com (HELO exalead.com) ([193.47.80.25]) by mail2-smtp-roc.national.inria.fr with ESMTP; 06 Mar 2008 10:53:08 +0100 Received: from [192.168.204.148] (madpc064.exalead.com [192.168.204.148]) (authenticated bits=0) by exalead.com (8.14.0/8.14.0) with ESMTP id m269r8je024331; Thu, 6 Mar 2008 10:53:08 +0100 Message-ID: <47CFBF04.9030703@exalead.com> Date: Thu, 06 Mar 2008 10:53:08 +0100 From: Berke Durak User-Agent: Thunderbird 1.5.0.10 (X11/20070221) MIME-Version: 1.0 To: Berke Durak Cc: Caml-list List Subject: Re: [Caml-list] Canonical Set/Map datastructure? References: <47CECF23.1020508@exalead.com> In-Reply-To: <47CECF23.1020508@exalead.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Miltered: at discorde with ID 47CFBEF6.000 by Joe's j-chkmail (http://j-chkmail . ensmp . fr)! X-Spam: no; 0.00; berke:01 durak:01 berke:01 durak:01 functors:01 hash-consing:01 hashtable:01 worst-case:01 combinator:01 polymorphic:01 caml-list:01 marshal:01 arbitrary:02 data:02 modules:02 Berke Durak a écrit : > The Map and Set modules use AVL trees which are efficient but not > canonical - a given > set of elements can have more than one representation. This means that > you cannot use > ad hoc comparison on sets and maps, and this is why they are presented > as functors. > > Does anyone know if, in the many years that have passed since the > implementation of > those fine modules, someone has invented a (functional) datastructure > that is as > efficient while being canonic? > > That would permit polymorphic set and map modules that work correctly on > sets of sets, for > instance. Of course, the order induced on sets by the adhoc comparison > doesn't have to > be a useful order; just being a good order would suffice. Thanks for all your replies. I did not know that Patricia trees were canonical. However, the idea of combining hash-consing and Patricia trees, however elegant, does not suit my problem. Basically, you are cheating by using an auxiliary data structure, the hashtable (which is also O(n^2) worst-case). As I was improving my IO combinator library with sets and maps, the structures need to be self-contained, and not need a description as a bitstring (which could be done by using Marshal.to_string but I don't think the performance would be there). Maybe some wizardry relying on the physical representation of objects would permit storage of arbitrary values in Patricia trees, but I remain skeptical. -- Berke DURAK