From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=AWL autolearn=disabled version=3.1.3 Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by yquem.inria.fr (Postfix) with ESMTP id 4135ABBC1 for ; Thu, 6 Mar 2008 18:36:56 +0100 (CET) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgAAAAO7z0fAXQInh2dsb2JhbACQeAEBAQgKKZsY X-IronPort-AV: E=Sophos;i="4.25,457,1199660400"; d="scan'208";a="9964454" Received: from concorde.inria.fr ([192.93.2.39]) by mail3-smtp-sop.national.inria.fr with ESMTP; 06 Mar 2008 18:36:55 +0100 Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105]) by concorde.inria.fr (8.13.6/8.13.6) with ESMTP id m26HatkN029136 (version=TLSv1/SSLv3 cipher=RC4-SHA bits=128 verify=OK) for ; Thu, 6 Mar 2008 18:36:55 +0100 X-IronPort-AV: E=Sophos;i="4.25,457,1199660400"; d="scan'208";a="23451418" Received: from mga09.intel.com ([134.134.136.24]) by mail4-smtp-sop.national.inria.fr with ESMTP; 06 Mar 2008 18:36:54 +0100 Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga102.jf.intel.com with ESMTP; 06 Mar 2008 09:36:52 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="4.25,457,1199692800"; d="scan'208";a="304145809" Received: from unknown (HELO azsmsx602.amr.corp.intel.com) ([10.2.163.6]) by fmsmga002.fm.intel.com with ESMTP; 06 Mar 2008 09:35:31 -0800 Received: from azsmsx501.amr.corp.intel.com ([10.2.167.99]) by azsmsx602.amr.corp.intel.com ([10.2.163.6]) with mapi; Thu, 6 Mar 2008 10:36:44 -0700 From: "Harrison, John R" To: Berke Durak Cc: Caml-list List Date: Thu, 6 Mar 2008 10:36:43 -0700 Subject: RE: [Caml-list] Canonical Set/Map datastructure? Thread-Topic: [Caml-list] Canonical Set/Map datastructure? Thread-Index: Ach/cAxKG8kxb+OaQB+DpIeFUc6viwAQFT7g Message-ID: References: <47CECF23.1020508@exalead.com> <47CFBF04.9030703@exalead.com> In-Reply-To: <47CFBF04.9030703@exalead.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: acceptlanguage: en-US Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-Miltered: at concorde with ID 47D02BB7.000 by Joe's j-chkmail (http://j-chkmail . ensmp . fr)! X-Spam: no; 0.00; berke:01 hash-consing:01 hashtable:01 worst-case:01 hashes:01 hash:01 worst-case:01 arne:01 combinator:01 ocaml's:01 hashes:01 ocaml's:01 hash:01 ocaml:01 equality:01 Hi Berke, | However, the idea of combining hash-consing and Patricia trees, | however elegant, does not suit my problem. Basically, you are | cheating by using an auxiliary data structure, the hashtable (which is | also O(n^2) worst-case). The code I pointed you to uses hashes, but not hash tables. As for the worst-case efficiency, what you say is true, but it's unlikely to matter in practice. And I suspect it may be unavoidable in principle, which is the interesting thing I learned last time this was discussed. See for example the following paper and its references: "New Tight Bounds on Uniquely Represented Dictionaries" Arne Andersson and Thomas Ottmann SIAM J. vol 24, pp. 1091-1103, 1995. The paper is available online as http://user.it.uu.se/~arnea/ps/andetigh.ps Note in particular the following paragraph: "Hence we show that there is a huge gap between unique and non-unique representations of dictionaries. We feel that these findings point to a fundamental fact in the theory of data structures". | As I was improving my IO combinator library with sets and maps, the | structures need to be self-contained, and not need a description as a | bitstring (which could be done by using Marshal.to_string but I don't | think the performance would be there). Maybe some wizardry relying on | the physical representation of objects would permit storage of | arbitrary values in Patricia trees, but I remain skeptical. -- My code uses OCaml's polymorphic hashes internally, but these do not appear in the interface and the user doesn't need to supply anything. Indeed, you may regard OCaml's polymorphic hash as a hack, but it is available in OCaml for every type just as surely as polymorphic equality. So I'm not sure I quite understand the nature of your objection. John.