* Canonical Set/Map datastructure? @ 2008-03-05 16:49 Berke Durak 2008-03-05 17:16 ` [Caml-list] " Brian Hurt ` (3 more replies) 0 siblings, 4 replies; 13+ messages in thread From: Berke Durak @ 2008-03-05 16:49 UTC (permalink / raw) To: Caml-list List The Map and Set modules use AVL trees which are efficient but not canonical - a given set of elements can have more than one representation. This means that you cannot use ad hoc comparison on sets and maps, and this is why they are presented as functors. Does anyone know if, in the many years that have passed since the implementation of those fine modules, someone has invented a (functional) datastructure that is as efficient while being canonic? That would permit polymorphic set and map modules that work correctly on sets of sets, for instance. Of course, the order induced on sets by the adhoc comparison doesn't have to be a useful order; just being a good order would suffice. -- Berke DURAK ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Caml-list] Canonical Set/Map datastructure? 2008-03-05 16:49 Canonical Set/Map datastructure? Berke Durak @ 2008-03-05 17:16 ` Brian Hurt 2008-03-05 17:27 ` Alain Frisch ` (2 subsequent siblings) 3 siblings, 0 replies; 13+ messages in thread From: Brian Hurt @ 2008-03-05 17:16 UTC (permalink / raw) To: Berke Durak; +Cc: Caml-list List Berke Durak wrote: > The Map and Set modules use AVL trees which are efficient but not > canonical - a given > set of elements can have more than one representation. This means > that you cannot use > ad hoc comparison on sets and maps, and this is why they are presented > as functors. However, as you can walk the tree in O(N), it's still possible to do set/map compare in O(N) worst case. All this means is that Pervasives.compare is not equivalent to Set.compare. > > Does anyone know if, in the many years that have passed since the > implementation of > those fine modules, someone has invented a (functional) datastructure > that is as > efficient while being canonic? The preserves "fast" (O(log N)) insert and removal? No. If you're willing to accept O(N) insert/removal cost, simple sorted lists work. I'm not sure if it's possible to have both fast insert/removal and a canonical form. Brian ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Caml-list] Canonical Set/Map datastructure? 2008-03-05 16:49 Canonical Set/Map datastructure? Berke Durak 2008-03-05 17:16 ` [Caml-list] " Brian Hurt @ 2008-03-05 17:27 ` Alain Frisch 2008-03-05 19:53 ` Jean-Christophe Filliâtre 2008-03-05 20:03 ` Jon Harrop 2008-03-05 17:34 ` Harrison, John R 2008-03-06 9:53 ` Berke Durak 3 siblings, 2 replies; 13+ messages in thread From: Alain Frisch @ 2008-03-05 17:27 UTC (permalink / raw) To: Berke Durak; +Cc: caml-list Berke Durak wrote: > The Map and Set modules use AVL trees which are efficient but not > canonical - a given > set of elements can have more than one representation. This means that > you cannot use > ad hoc comparison on sets and maps, and this is why they are presented > as functors. > > Does anyone know if, in the many years that have passed since the > implementation of > those fine modules, someone has invented a (functional) datastructure > that is as > efficient while being canonic? Well, Patricia trees have been around for many years and they satisfy this property. They also allow set operations (union, intersection, ...) in linear time (and I explain below how this can be optimized to something which is really efficient for some applications). Jean-Christophe Filliâtre has an implementation on its web page. Patricia trees work fine when the set elements can easily be represented as strings of bits. So if you can map your elements to integers, that's ok. Otherwise, you can hash-cons your elements to get unique integers for them. Something that Jean-Christophe's implementation doesn't do but which is quite easy to add is to use hash-consing on patricia trees themselves, that is, to memoize their constructors in order to get unique physical representation and maximal sharing. That way, you get: structural equality = physical equality = set equality With this property, set operations on patricia trees can be optimized with reflexivity properties (e.g. the inner loop of the union function can start by checking equality of its arguments). Also, you get a nice unique integer for each tree. This allow you to memoize efficiently set operations (like union, intersection, for which you can use memoization in the inner loop, not only at toplevel), and to build sets of sets (and so on). -- Alain ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Caml-list] Canonical Set/Map datastructure? 2008-03-05 17:27 ` Alain Frisch @ 2008-03-05 19:53 ` Jean-Christophe Filliâtre 2008-03-05 20:03 ` Jon Harrop 1 sibling, 0 replies; 13+ messages in thread From: Jean-Christophe Filliâtre @ 2008-03-05 19:53 UTC (permalink / raw) To: Alain Frisch; +Cc: Berke Durak, caml-list Alain Frisch a écrit : > Something that Jean-Christophe's implementation doesn't do but which is > quite easy to add is to use hash-consing on patricia trees themselves, This is a nice idea, thanks; I will eventually add this to my implementation. Meanwhile, I can mention that there is also an hash-consing library on my web page (together with a paper describing the technique) and thus it should be as ``simple'' as combining the two codes :-) -- Jean-Christophe Filliâtre http://www.lri.fr/~filliatr/ ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Caml-list] Canonical Set/Map datastructure? 2008-03-05 17:27 ` Alain Frisch 2008-03-05 19:53 ` Jean-Christophe Filliâtre @ 2008-03-05 20:03 ` Jon Harrop 2008-03-05 21:56 ` Alain Frisch 2008-03-06 7:45 ` Jean-Christophe Filliâtre 1 sibling, 2 replies; 13+ messages in thread From: Jon Harrop @ 2008-03-05 20:03 UTC (permalink / raw) To: caml-list On Wednesday 05 March 2008 17:27:38 Alain Frisch wrote: > Patricia trees work fine when the set elements can easily be represented > as strings of bits. > ... > structural equality = physical equality = set equality This is very interesting. What are the disadvantages? I'm guessing: slow value creation for small values and heavy memory consumption. -- Dr Jon D Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/products/?e ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Caml-list] Canonical Set/Map datastructure? 2008-03-05 20:03 ` Jon Harrop @ 2008-03-05 21:56 ` Alain Frisch 2008-03-06 7:45 ` Jean-Christophe Filliâtre 1 sibling, 0 replies; 13+ messages in thread From: Alain Frisch @ 2008-03-05 21:56 UTC (permalink / raw) To: Jon Harrop; +Cc: caml-list Jon Harrop wrote: > On Wednesday 05 March 2008 17:27:38 Alain Frisch wrote: >> Patricia trees work fine when the set elements can easily be represented >> as strings of bits. >> ... >> structural equality = physical equality = set equality > > This is very interesting. What are the disadvantages? I'm guessing: slow value > creation for small values and heavy memory consumption. I haven't done any serious benchmark comparing Patricia trees and OCaml's Set module. For the situations where we're using hash-consed Patricia trees, it is simply not an option to use balanced trees (that would turn almost linear algorithms into quadratic-or-more ones). Of course, hash-consing (and memoization of set operations) means extra tables and lookups, so there will be some overhead. Last time I checked, it was much more efficient to use regular hash tables rather than weak ones, but this might have changed since then. Another disadvantage is that hash-consing usually does not play well with marshaling (e.g. you need to manually traverse unmarshaled values to restore invariants). Alain ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Caml-list] Canonical Set/Map datastructure? 2008-03-05 20:03 ` Jon Harrop 2008-03-05 21:56 ` Alain Frisch @ 2008-03-06 7:45 ` Jean-Christophe Filliâtre 1 sibling, 0 replies; 13+ messages in thread From: Jean-Christophe Filliâtre @ 2008-03-06 7:45 UTC (permalink / raw) To: Jon Harrop; +Cc: caml-list Jon Harrop wrote: > On Wednesday 05 March 2008 17:27:38 Alain Frisch wrote: >> Patricia trees work fine when the set elements can easily be represented >> as strings of bits. >> ... >> structural equality = physical equality = set equality > > This is very interesting. What are the disadvantages? I'm guessing: slow value > creation for small values and heavy memory consumption. Regarding memory consumpion, it is true that hash-consing tables are using memory, but sharing structurally equal values may also save some space; see for instance the benchmarks on page 4 of http://www.lri.fr/~filliatr/ftp/publis/hash-consing2.ps -- Jean-Christophe ^ permalink raw reply [flat|nested] 13+ messages in thread
* RE: [Caml-list] Canonical Set/Map datastructure? 2008-03-05 16:49 Canonical Set/Map datastructure? Berke Durak 2008-03-05 17:16 ` [Caml-list] " Brian Hurt 2008-03-05 17:27 ` Alain Frisch @ 2008-03-05 17:34 ` Harrison, John R 2008-03-06 9:53 ` Berke Durak 3 siblings, 0 replies; 13+ messages in thread From: Harrison, John R @ 2008-03-05 17:34 UTC (permalink / raw) To: Berke Durak, Caml-list List Hi Berke, | Does anyone know if, in the many years that have passed since the | implementation of those fine modules, someone has invented a | (functional) datastructure that is as efficient while being | canonic? I have an implementation, which you're welcome to use/adapt. It's certainly not particularly well optimized, but I've relied on it for a few years with no problems. The code is in this file: http://www.cl.cam.ac.uk/~jrh13/atp/OCaml/lib.ml Search for "Diego" and the relevant stuff starts there; a few earlier functions (like "uniq") are used, but they are easy to copy or replace. This was the outcome of a similar question I asked on the OCaml list a few years ago. Diego Olivier Fernandez Pons not only pointed me at some interesting theoretical results, but also suggested an idea using hashes and Patricia trees that works very well in practice. That's what I implemented. My implementation is for maps only, but it's easy to adapt for sets. Or you could just use maps to type "unit" if you're lazy like me. John. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Caml-list] Canonical Set/Map datastructure? 2008-03-05 16:49 Canonical Set/Map datastructure? Berke Durak ` (2 preceding siblings ...) 2008-03-05 17:34 ` Harrison, John R @ 2008-03-06 9:53 ` Berke Durak 2008-03-06 17:36 ` Harrison, John R 2008-03-07 10:19 ` Alain Frisch 3 siblings, 2 replies; 13+ messages in thread From: Berke Durak @ 2008-03-06 9:53 UTC (permalink / raw) To: Berke Durak; +Cc: Caml-list List Berke Durak a écrit : > The Map and Set modules use AVL trees which are efficient but not > canonical - a given > set of elements can have more than one representation. This means that > you cannot use > ad hoc comparison on sets and maps, and this is why they are presented > as functors. > > Does anyone know if, in the many years that have passed since the > implementation of > those fine modules, someone has invented a (functional) datastructure > that is as > efficient while being canonic? > > That would permit polymorphic set and map modules that work correctly on > sets of sets, for > instance. Of course, the order induced on sets by the adhoc comparison > doesn't have to > be a useful order; just being a good order would suffice. Thanks for all your replies. I did not know that Patricia trees were canonical. However, the idea of combining hash-consing and Patricia trees, however elegant, does not suit my problem. Basically, you are cheating by using an auxiliary data structure, the hashtable (which is also O(n^2) worst-case). As I was improving my IO combinator library with sets and maps, the structures need to be self-contained, and not need a description as a bitstring (which could be done by using Marshal.to_string but I don't think the performance would be there). Maybe some wizardry relying on the physical representation of objects would permit storage of arbitrary values in Patricia trees, but I remain skeptical. -- Berke DURAK ^ permalink raw reply [flat|nested] 13+ messages in thread
* RE: [Caml-list] Canonical Set/Map datastructure? 2008-03-06 9:53 ` Berke Durak @ 2008-03-06 17:36 ` Harrison, John R 2008-03-07 10:09 ` Berke Durak 2008-03-07 10:19 ` Alain Frisch 1 sibling, 1 reply; 13+ messages in thread From: Harrison, John R @ 2008-03-06 17:36 UTC (permalink / raw) To: Berke Durak; +Cc: Caml-list List Hi Berke, | However, the idea of combining hash-consing and Patricia trees, | however elegant, does not suit my problem. Basically, you are | cheating by using an auxiliary data structure, the hashtable (which is | also O(n^2) worst-case). The code I pointed you to uses hashes, but not hash tables. As for the worst-case efficiency, what you say is true, but it's unlikely to matter in practice. And I suspect it may be unavoidable in principle, which is the interesting thing I learned last time this was discussed. See for example the following paper and its references: "New Tight Bounds on Uniquely Represented Dictionaries" Arne Andersson and Thomas Ottmann SIAM J. vol 24, pp. 1091-1103, 1995. The paper is available online as http://user.it.uu.se/~arnea/ps/andetigh.ps Note in particular the following paragraph: "Hence we show that there is a huge gap between unique and non-unique representations of dictionaries. We feel that these findings point to a fundamental fact in the theory of data structures". | As I was improving my IO combinator library with sets and maps, the | structures need to be self-contained, and not need a description as a | bitstring (which could be done by using Marshal.to_string but I don't | think the performance would be there). Maybe some wizardry relying on | the physical representation of objects would permit storage of | arbitrary values in Patricia trees, but I remain skeptical. -- My code uses OCaml's polymorphic hashes internally, but these do not appear in the interface and the user doesn't need to supply anything. Indeed, you may regard OCaml's polymorphic hash as a hack, but it is available in OCaml for every type just as surely as polymorphic equality. So I'm not sure I quite understand the nature of your objection. John. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Caml-list] Canonical Set/Map datastructure? 2008-03-06 17:36 ` Harrison, John R @ 2008-03-07 10:09 ` Berke Durak 2008-03-07 17:13 ` Harrison, John R 0 siblings, 1 reply; 13+ messages in thread From: Berke Durak @ 2008-03-07 10:09 UTC (permalink / raw) To: Harrison, John R; +Cc: Caml-list List Harrison, John R a écrit : > Hi Berke, > > | However, the idea of combining hash-consing and Patricia trees, > | however elegant, does not suit my problem. Basically, you are > | cheating by using an auxiliary data structure, the hashtable (which is > | also O(n^2) worst-case). > > The code I pointed you to uses hashes, but not hash tables. As for the > worst-case efficiency, what you say is true, but it's unlikely to > matter in practice. And I suspect it may be unavoidable in principle, > which is the interesting thing I learned last time this was discussed. > See for example the following paper and its references: Hi John, Thanks for your code. I'm sorry I thought you maintained a separate hash table. It's very interesting code and I'll give it a try. But it seems to have two properties which make it unsuitable for the particular application I had in mind: - The theoretical worst-case performance of hash-based data structures can be attained if the hash function has bad dispersal. As the code relies on Hashtbl.hash, which does not hash deeply, this could potentially be a proble, in particular if your keys have long "common prefixes" that are not distinguished by the hash function. It might work well in practice but I feel a little uncomfortable. - Also, it does not preserve the natural order for keys, and that is particularly bad, because I often use, for instance, float-indexed maps or sets as a substitute for heaps. The paper with the inverse cubic lower bound is *very* interesting; we don't see plausible lower bounds often in complexity theory, especially with such large assumptions (just bounded out-degree for the graph nodes). So it seems there is little hope to have a drop-in replacement for Set or Map that is compatible with the natural order of things, a.k.a., Pervasives.compare. -- Berke DURAK ^ permalink raw reply [flat|nested] 13+ messages in thread
* RE: [Caml-list] Canonical Set/Map datastructure? 2008-03-07 10:09 ` Berke Durak @ 2008-03-07 17:13 ` Harrison, John R 0 siblings, 0 replies; 13+ messages in thread From: Harrison, John R @ 2008-03-07 17:13 UTC (permalink / raw) To: Berke Durak; +Cc: Caml-list List [-- Attachment #1: Type: text/plain, Size: 2308 bytes --] Hi Berke, | Thanks for your code. I'm sorry I thought you maintained a separate | hash table. It's very interesting code and I'll give it a try. It would in fact have been more efficient to use Jean-Christophe Filliatre's implementation of Patricia trees instead of writing my own. However, it was important to me to release the code under a BSD-like license. For what it's worth, I attach a version of my code that's extracted from its context so it can be used independently. (I replaced a few of my pet functions with standard library ones, and I hope I didn't screw anything up in the process.) | - The theoretical worst-case performance of hash-based data structures | can be attained if the hash function has bad dispersal. As the code | relies on Hashtbl.hash, which does not hash deeply, this could | potentially be a proble, in particular if your keys have long "common | prefixes" that are not distinguished by the hash function. It might | work well in practice but I feel a little uncomfortable. Yes, sure. My applications are mainly in theorem proving and symbolic computation where this isn't an issue, and I can imagine it might not be suitable elsewhere. | - Also, it does not preserve the natural order for keys, and that | is particularly bad, because I often use, for instance, | float-indexed maps or sets as a substitute for heaps. When I was looking at this area last time (maybe just following the references from the paper I cited) I came across a kind of mixed heap/tree structure with distinct "horizontal" and "vertical" orderings. That might be something to consider. But once again there is a bad worst-case performance if the two orders happen to be correlated. | The paper with the inverse cubic lower bound is *very* interesting; we | don't see plausible lower bounds often in complexity theory, | especially with such large assumptions (just bounded out-degree for | the graph nodes). | | So it seems there is little hope to have a drop-in replacement for Set | or Map that is compatible with the natural order of things, a.k.a., | Pervasives.compare. Yes, I really found it striking that in a fundamental sense, you need to pay the price of noncanonicality in order to get the guaranteed O(log n) lookup. John. [-- Attachment #2: map.ml --] [-- Type: application/octet-stream, Size: 10766 bytes --] (* ------------------------------------------------------------------------- *) (* Polymorphic finite partial functions via Patricia trees. *) (* *) (* The point of this strange representation is that it is canonical (equal *) (* functions have the same encoding) yet reasonably efficient on average. *) (* *) (* (c) John Harrison 2003. *) (* *) (* See http://www.cl.cam.ac.uk/~jrh13/atp/OCaml/LICENSE.txt *) (* *) (* Idea due to Diego Olivier Fernandez Pons (OCaml list, 2003/11/10). *) (* ------------------------------------------------------------------------- *) type ('a,'b)func = Empty | Leaf of int * ('a*'b)list | Branch of int * int * ('a,'b)func * ('a,'b)func;; (* ------------------------------------------------------------------------- *) (* Undefined function. *) (* ------------------------------------------------------------------------- *) let undefined = Empty;; (* ------------------------------------------------------------------------- *) (* In case of equality comparison worries, better use this. *) (* ------------------------------------------------------------------------- *) let is_undefined f = match f with Empty -> true | _ -> false;; (* ------------------------------------------------------------------------- *) (* Operation analogous to "map" for lists. *) (* ------------------------------------------------------------------------- *) let mapf = let rec map_list f l = match l with [] -> [] | (x,y)::t -> (x,f(y))::(map_list f t) in let rec mapf f t = match t with Empty -> Empty | Leaf(h,l) -> Leaf(h,map_list f l) | Branch(p,b,l,r) -> Branch(p,b,mapf f l,mapf f r) in mapf;; (* ------------------------------------------------------------------------- *) (* Operations analogous to "fold" for lists. *) (* ------------------------------------------------------------------------- *) let foldl = let rec foldl_list f a l = match l with [] -> a | (x,y)::t -> foldl_list f (f a x y) t in let rec foldl f a t = match t with Empty -> a | Leaf(h,l) -> foldl_list f a l | Branch(p,b,l,r) -> foldl f (foldl f a l) r in foldl;; let foldr = let rec foldr_list f l a = match l with [] -> a | (x,y)::t -> f x y (foldr_list f t a) in let rec foldr f t a = match t with Empty -> a | Leaf(h,l) -> foldr_list f l a | Branch(p,b,l,r) -> foldr f l (foldr f r a) in foldr;; (* ------------------------------------------------------------------------- *) (* Application. *) (* ------------------------------------------------------------------------- *) let applyd = let rec apply_listd l d x = match l with (a,b)::t -> let c = Pervasives.compare x a in if c = 0 then b else if c > 0 then apply_listd t d x else d x | [] -> d x in fun f d x -> let k = Hashtbl.hash x in let rec look t = match t with Leaf(h,l) when h = k -> apply_listd l d x | Branch(p,b,l,r) when (k lxor p) land (b - 1) = 0 -> look (if k land b = 0 then l else r) | _ -> d x in look f;; let apply f = applyd f (fun x -> failwith "apply");; let tryapplyd f a d = applyd f (fun x -> d) a;; let tryapplyl f x = tryapplyd f x [];; let defined f x = try apply f x; true with Failure _ -> false;; (* ------------------------------------------------------------------------- *) (* Undefinition. *) (* ------------------------------------------------------------------------- *) let undefine = let rec undefine_list x l = match l with (a,b as ab)::t -> let c = Pervasives.compare x a in if c = 0 then t else if c < 0 then l else let t' = undefine_list x t in if t' == t then l else ab::t' | [] -> [] in fun x -> let k = Hashtbl.hash x in let rec und t = match t with Leaf(h,l) when h = k -> let l' = undefine_list x l in if l' == l then t else if l' = [] then Empty else Leaf(h,l') | Branch(p,b,l,r) when k land (b - 1) = p -> if k land b = 0 then let l' = und l in if l' == l then t else if is_undefined l' then r else Branch(p,b,l',r) else let r' = und r in if r' == r then t else if is_undefined r' then l else Branch(p,b,l,r') | _ -> t in und;; (* ------------------------------------------------------------------------- *) (* Redefinition and combination. *) (* ------------------------------------------------------------------------- *) let (|->),combine = let ldb x y = let z = x lxor y in z land (-z) in let newbranch p1 t1 p2 t2 = let b = ldb p1 p2 in let p = p1 land (b - 1) in if p1 land b = 0 then Branch(p,b,t1,t2) else Branch(p,b,t2,t1) in let rec define_list (x,y as xy) l = match l with (a,b as ab)::t -> let c = Pervasives.compare x a in if c = 0 then xy::t else if c < 0 then xy::l else ab::(define_list xy t) | [] -> [xy] and combine_list op z l1 l2 = match (l1,l2) with [],_ -> l2 | _,[] -> l1 | ((x1,y1 as xy1)::t1,(x2,y2 as xy2)::t2) -> let c = Pervasives.compare x1 x2 in if c < 0 then xy1::(combine_list op z t1 l2) else if c > 0 then xy2::(combine_list op z l1 t2) else let y = op y1 y2 and l = combine_list op z t1 t2 in if z(y) then l else (x1,y)::l in let (|->) x y = let k = Hashtbl.hash x in let rec upd t = match t with Empty -> Leaf (k,[x,y]) | Leaf(h,l) -> if h = k then Leaf(h,define_list (x,y) l) else newbranch h t k (Leaf(k,[x,y])) | Branch(p,b,l,r) -> if k land (b - 1) <> p then newbranch p t k (Leaf(k,[x,y])) else if k land b = 0 then Branch(p,b,upd l,r) else Branch(p,b,l,upd r) in upd in let rec combine op z t1 t2 = match (t1,t2) with Empty,_ -> t2 | _,Empty -> t1 | Leaf(h1,l1),Leaf(h2,l2) -> if h1 = h2 then let l = combine_list op z l1 l2 in if l = [] then Empty else Leaf(h1,l) else newbranch h1 t1 h2 t2 | (Leaf(k,lis) as lf),(Branch(p,b,l,r) as br) -> if k land (b - 1) = p then if k land b = 0 then let l' = combine op z lf l in if is_undefined l' then r else Branch(p,b,l',r) else let r' = combine op z lf r in if is_undefined r' then l else Branch(p,b,l,r') else newbranch k lf p br | (Branch(p,b,l,r) as br),(Leaf(k,lis) as lf) -> if k land (b - 1) = p then if k land b = 0 then let l' = combine op z l lf in if is_undefined l' then r else Branch(p,b,l',r) else let r' = combine op z r lf in if is_undefined r' then l else Branch(p,b,l,r') else newbranch p br k lf | Branch(p1,b1,l1,r1),Branch(p2,b2,l2,r2) -> if b1 < b2 then if p2 land (b1 - 1) <> p1 then newbranch p1 t1 p2 t2 else if p2 land b1 = 0 then let l = combine op z l1 t2 in if is_undefined l then r1 else Branch(p1,b1,l,r1) else let r = combine op z r1 t2 in if is_undefined r then l1 else Branch(p1,b1,l1,r) else if b2 < b1 then if p1 land (b2 - 1) <> p2 then newbranch p1 t1 p2 t2 else if p1 land b2 = 0 then let l = combine op z t1 l2 in if is_undefined l then r2 else Branch(p2,b2,l,r2) else let r = combine op z t1 r2 in if is_undefined r then l2 else Branch(p2,b2,l2,r) else if p1 = p2 then let l = combine op z l1 l2 and r = combine op z r1 r2 in if is_undefined l then r else if is_undefined r then l else Branch(p1,b1,l,r) else newbranch p1 t1 p2 t2 in (|->),combine;; (* ------------------------------------------------------------------------- *) (* Special case of point function. *) (* ------------------------------------------------------------------------- *) let (|=>) = fun x y -> (x |-> y) undefined;; (* ------------------------------------------------------------------------- *) (* Grab an arbitrary element. *) (* ------------------------------------------------------------------------- *) let rec choose t = match t with Empty -> failwith "choose: completely undefined function" | Leaf(h,l) -> List.hd l | Branch(b,p,t1,t2) -> choose t1;; (* ------------------------------------------------------------------------- *) (* Mapping to sorted-list representation of the graph, domain and range. *) (* ------------------------------------------------------------------------- *) let rec uniq l = match l with x::(y::_ as t) -> let t' = uniq t in if Pervasives.compare x y = 0 then t' else if t'==t then l else x::t' | _ -> l;; let setify l = uniq(List.sort Pervasives.compare l);; let graph f = setify (foldl (fun a x y -> (x,y)::a) [] f);; let dom f = setify(foldl (fun a x y -> x::a) [] f);; let ran f = setify(foldl (fun a x y -> y::a) [] f);; (* ------------------------------------------------------------------------- *) (* Idiom for a mapping zipping domain and range lists. *) (* ------------------------------------------------------------------------- *) let fpf xs ys = List.fold_right2 (|->) xs ys undefined;; (* ------------------------------------------------------------------------- *) (* Install a (trivial) printer for finite partial functions. *) (* ------------------------------------------------------------------------- *) let print_fpf (f:('a,'b)func) = print_string "<func>";; #install_printer print_fpf;; ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Caml-list] Canonical Set/Map datastructure? 2008-03-06 9:53 ` Berke Durak 2008-03-06 17:36 ` Harrison, John R @ 2008-03-07 10:19 ` Alain Frisch 1 sibling, 0 replies; 13+ messages in thread From: Alain Frisch @ 2008-03-07 10:19 UTC (permalink / raw) To: Berke Durak; +Cc: caml-list Berke Durak wrote: > However, the idea of combining hash-consing and Patricia trees, however > elegant, does not suit my problem. Basically, you are cheating by using > an auxiliary data structure, the hashtable (which is also O(n^2) > worst-case). You can also implement hash-consing with the Map module (or a polymorphic version of it) to avoid the worst case complexity. But yes, you need an extra data structure. -- Alain ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2008-03-07 17:13 UTC | newest] Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2008-03-05 16:49 Canonical Set/Map datastructure? Berke Durak 2008-03-05 17:16 ` [Caml-list] " Brian Hurt 2008-03-05 17:27 ` Alain Frisch 2008-03-05 19:53 ` Jean-Christophe Filliâtre 2008-03-05 20:03 ` Jon Harrop 2008-03-05 21:56 ` Alain Frisch 2008-03-06 7:45 ` Jean-Christophe Filliâtre 2008-03-05 17:34 ` Harrison, John R 2008-03-06 9:53 ` Berke Durak 2008-03-06 17:36 ` Harrison, John R 2008-03-07 10:09 ` Berke Durak 2008-03-07 17:13 ` Harrison, John R 2008-03-07 10:19 ` Alain Frisch
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox