* [Caml-list] some Hashtbl observations @ 2002-02-03 19:25 Doug Bagley 2002-02-04 7:57 ` Michel Quercia 2002-02-04 8:52 ` Jean-Christophe Filliatre 0 siblings, 2 replies; 3+ messages in thread From: Doug Bagley @ 2002-02-03 19:25 UTC (permalink / raw) To: caml-list I recently had occasion to create a hash table to remember some Num's I was generating. Of course, I didn't figure out right away that you can't directly use the generic interface for Hashtbl with the type "Num". Well, actually, you *can* but you won't find out you have made an error until you are already a good way into your work: # #load "nums.cma";; # let h = Hashtbl.create 10;; val h : ('_a, '_b) Hashtbl.t = <abstr> # let one = Num.num_of_int 1;; val one : Num.num = Num.Int 1 # let nine_tenths = Num.num_of_string "9/10";; val nine_tenths : Num.num = Num.Ratio <abstr> # let ten_ninths = Num.num_of_string "10/9";; val ten_ninths : Num.num = Num.Ratio <abstr> # let remember k = try Hashtbl.find h k with Not_found -> Hashtbl.add h k ();; val remember : ('a, unit) Hashtbl.t -> 'a -> unit = <fun> # remember one;; - : unit = () # remember nine_tenths;; - : unit = () # remember ten_ninths;; Exception: Failure "equal: abstract value". D'oh! So I guess I shouldn't use an <abstr> type as a hash key in the generic Hashtbl interface. That makes sense in hindsight. Well, I went off to try to create my own custom hash with the functorial interface. The invocation was a bit of a mystery to me, but I eventually found an example and got that to work. But it seemed pretty slow. I tried a couple different hashing functions, one was just: let hash_of_num n = Hashtbl.hash (float_of_num n);; With the other I tried converting the numer/denom to ints, and shifting the numerator left 15 bits. In any case, it seemed like I ended up with pretty slow find/add operations. (Also, I was using eq_num as the "equal" function). I realized that since I was only using Num's to represent small rational numbers that I could switch back to the generic Hashtbl interface and use as a key a tuple of ints to represent (numerator, denominator). That turned out faster than my custom hash (by about an order of magnitude for my program), and the generic Hashtbl interface seemed happy with it. (I also saved the actual Num value along with each key generated so I didn't have to do any backwards conversion). I sat and pondered why OCaml couldn't tell me earlier on why there was a problem storing Num's via the generic interface. I think I figured it out. Some operations require the application of (=) to the keys, and unless you hit that operation on keys that are =-wise incompatible, you won't see the exception above. e.g.: # one = one;; - : bool = true # one = ten_ninths;; - : bool = false # nine_tenths = ten_ninths;; Exception: Failure "equal: abstract value". Well, that makes sense. I wondered if the type system could be used to check earlier on if the key type has the property of being =-compatible, but am I right in surmising that if so, it would require changes to both Hashtbl and Num? I suppose that may not be practical. Maybe Hashtbl's behavior for keys of abstract type could be more explicitly documented? It might help some other newbies. Anyway, that's my trip report from the land of Hashtbl. If anyone can suggest ways I could have made life easier for myself, please let me know. Thanks. cheers, doug ------------------- Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [Caml-list] some Hashtbl observations 2002-02-03 19:25 [Caml-list] some Hashtbl observations Doug Bagley @ 2002-02-04 7:57 ` Michel Quercia 2002-02-04 8:52 ` Jean-Christophe Filliatre 1 sibling, 0 replies; 3+ messages in thread From: Michel Quercia @ 2002-02-04 7:57 UTC (permalink / raw) To: Doug Bagley; +Cc: caml-list Le Sun, 3 Feb 2002 13:25:17 -0600 doug@bagley.org (Doug Bagley) écrivit : > I recently had occasion to create a hash table to remember some Num's > I was generating. Of course, I didn't figure out right away that you > can't directly use the generic interface for Hashtbl with the type > "Num". > ... > Exception: Failure "equal: abstract value". I ran into this problem when developping the Bigint interface for Numerix. You'll find in numerix-0.19 (http://pauillac.inria.fr/~quercia/cdrom/bibs/numerix-0.19a.tar.gz) a work around implementing the missing functions (generic compare and generic hash) for nats. Below is the patch file used in numerix-0.19 : ---------------------- cut here ---------------------------------- *** otherlibs/num/nat_stubs.c.orig Fri Aug 24 15:08:51 2001 --- otherlibs/num/nat_stubs.c Fri Aug 24 15:46:58 2001 *************** *** 11,16 **** --- 11,18 ---- /***********************************************************************/ /* $Id: nat_stubs.c,v 1.10 2000/04/20 16:20:59 xleroy Exp $ */ + /* Modified by M.Quercia on August 24th, 2001 for generic compare and hash */+ /* All modified lines end with the comment MQ */ #define CAML_LIGHT #include "alloc.h" *************** *** 28,39 **** static void serialize_nat(value, unsigned long *, unsigned long *); static unsigned long deserialize_nat(void * dst); static struct custom_operations nat_operations = { "_nat", custom_finalize_default, ! custom_compare_default, ! custom_hash_default, serialize_nat, deserialize_nat }; --- 30,43 ---- static void serialize_nat(value, unsigned long *, unsigned long *); static unsigned long deserialize_nat(void * dst); + static int nat_compare(value a, value b); /* MQ */ + static long nat_hash(value a); /* MQ */ static struct custom_operations nat_operations = { "_nat", custom_finalize_default, ! nat_compare, /* custom_compare_default, MQ */ ! nat_hash, /* custom_hash_default, MQ */ serialize_nat, deserialize_nat }; *************** *** 331,333 **** --- 335,350 ---- return len * 4; } + /* custom comparison MQ */ + static int nat_compare(value a, value b) { /* MQ */ + return (BnnCompare(Bignum_val(a), Wosize_val(a) - 1, /* MQ */ + Bignum_val(b), Wosize_val(b) - 1)); /* MQ */ + } /* MQ */ + + /* hashing, borrowed from ocaml/byterun/hash.c MQ */ + static long nat_hash(value a) { /* MQ */ + unsigned long l = Wosize_val(a) - 1, accu = l, i; /* MQ */ + for (i=0; i<l; i++) /* MQ */ + accu = accu*65599 + BnGetDigit(Bignum_val(a),i); /* MQ */ + return(accu); /* MQ */ + } /* MQ */ ---------------------- cut here ---------------------------------- Refer to the Numerix doc (file doc/numerix-eng.ps, page 42) for instructions. Regards, -- Michel Quercia 57 rue abbé Grégoire, 38000 Grenoble http://michel.quercia.free.fr (maths) http://pauillac.inria.fr/~quercia (informatique) mailto:michel.quercia@prepas.org ------------------- Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [Caml-list] some Hashtbl observations 2002-02-03 19:25 [Caml-list] some Hashtbl observations Doug Bagley 2002-02-04 7:57 ` Michel Quercia @ 2002-02-04 8:52 ` Jean-Christophe Filliatre 1 sibling, 0 replies; 3+ messages in thread From: Jean-Christophe Filliatre @ 2002-02-04 8:52 UTC (permalink / raw) To: Doug Bagley; +Cc: caml-list > D'oh! So I guess I shouldn't use an <abstr> type as a hash key in the > generic Hashtbl interface. That makes sense in hindsight. There is actually another issue related to ocaml bignums related to generic hash and equalit functions: the lack of unique representation. Indeed, type "num" is defined as ====================================================================== type num = Int of int | Big_int of Big_int.big_int | Ratio of Ratio.ratio ====================================================================== (to improve computations over small integers or big integers not being rationals) but a given rational may have several representations (e.g. 1 may be (Int 1) sometimes and some ratio (Ratio ...) later). Thus, even if hash and equality would not fail on the abstract type ratio, they would fail to hash or compare properly two different representations of the same number (generic hash and equality functions are---among other things---based on constructors tags). Hope this helps, -- Jean-Christophe Filliatre (http://www.lri.fr/~filliatr) ------------------- Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr ^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2002-02-04 8:52 UTC | newest] Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2002-02-03 19:25 [Caml-list] some Hashtbl observations Doug Bagley 2002-02-04 7:57 ` Michel Quercia 2002-02-04 8:52 ` Jean-Christophe Filliatre
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox