From: Manuel Fahndrich <maf@microsoft.com>
To: caml-list@pauillac.inria.fr
Subject: Alternative generic hash function
Date: Wed, 24 May 2000 15:04:11 -0700 [thread overview]
Message-ID: <783D93998201D311B0CF00805FEAA07B7E9233@RED-MSG-42> (raw)
The generic hash function Hashtbl.hash traverses into mutable data
structures. So I get different hash values depending on updates of the data
structure.
# let x = ref None;;
val x : '_a option ref = {contents=None}
# Hashtbl.hash x;;
- : int = 0
# x := Some 5;;
- : unit = ()
# Hashtbl.hash x;;
- : int = 5
As was pointed out times before on this list, Hashtbl.hash is thus
unsuitable for hashing on data structures that contain mutable data.
I'm wondering what the design rational is for this choice. I'd rather have a
generic hash function (Hashtbl.hash_pure) that never looks into mutable
structures (skips refs and mutable fields of records and classes).
The reason being that if I want to hash on data that contains references,
I'm more likely to want to find the structure independent of the contents of
any mutable subparts.
So yes, a hashtable containing int refs would pointless, because it maps
them all to the same bucket. But I frequently use hashing with equality
being ==. Whenever I do this, I need to actually add some unique identifier
to the data structure just for the hashing operation. Having an alternate
hash function (Hashtbl.hash_pure) would make such id's (and their managment)
unnecessary.
I see how the current working of Hashtbl.hash is useful for data structures
that contain mutable parts in the case where such structures don't change
between the time they are stored and later looked up in the hash table. But
I'm wondering how often this actually arises.
Furthermore, using Hashtbl.hash_pure in place of Hashtbl.hash would not
break any old code (except for performance), since all it does is make the
equivalence classes bigger. On the other hand, using Hashtbl.hash with the
misconception that it acts as Hashtbl.hash_pure breaks lots of code.
I rest my case.
Manuel Fahndrich
next reply other threads:[~2000-05-25 7:01 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2000-05-24 22:04 Manuel Fahndrich [this message]
2000-05-25 7:00 ` Pierre Weis
2000-05-25 7:14 ` STARYNKEVITCH Basile
2000-05-25 7:38 ` Dmitri Lomov
2000-05-25 8:24 ` Francois Pottier
2000-05-26 19:35 ` Dan Grossman
2000-05-29 13:13 ` Francois Pottier
2000-05-29 23:45 ` Ken Wakita
2000-05-30 17:25 ` Dan Grossman
2000-05-25 17:08 Manuel Fahndrich
2000-05-29 15:00 Damien Doligez
2000-05-30 18:04 ` Daniel Ortmann
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=783D93998201D311B0CF00805FEAA07B7E9233@RED-MSG-42 \
--to=maf@microsoft.com \
--cc=caml-list@pauillac.inria.fr \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox