From: Brian Hurt <bhurt@spnz.org>
To: Xavier Leroy <xavier.leroy@inria.fr>
Cc: Oliver Bandel <oliver@first.in-berlin.de>, <caml-list@inria.fr>
Subject: Re: [Caml-list] Should be INSIDE STANDARD-LIB: Hashtbl.keys
Date: Fri, 23 Apr 2004 11:06:06 -0500 (CDT)
Message-ID: <Pine.LNX.4.44.0404231054160.9460-100000@localhost.localdomain>
In-Reply-To: <20040423145141.B3686@pauillac.inria.fr>
On Fri, 23 Apr 2004, Xavier Leroy wrote:
> > I think a good addition to the Hashtbl-module
> > would be a function, that gives back a list of keys
> > that are in the hash.
>
> With your specification (no repetitions in the list), that function
> would run in quadratic time, which is a sure sign that lists aren't
> the right data structure here. (More generally speaking, "lists
> without repetitions" is almost always the wrong data structure.)
No, I think creating such a list would take O(n log n) time.
OK, we're starting with a hash table. That means we have a set of
buckets, and each bucket is a set of key/data pairs. Assume the same key
can be inserted multiple times (can it?). In that case, all duplicate keys
end up in the same bucket, since equal keys hash to the same value. So, for
each bucket, I sort all entries in the bucket by key (worst case, I have
only one bucket and sorting is O(n log n)). Once sorted, I go through and
eliminate duplicates, which is now an O(n) algorithm:
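(* Drop adjacent duplicates; assumes lst is already sorted. *)
let uniq lst =
  let rec loop accum = function
    | [] -> List.rev accum
    | x :: [] -> List.rev (x :: accum)
    | x :: y :: t ->
      if (x = y) then
        (* x and y are equal: keep only one of them and continue *)
        loop accum (x :: t)
      else
        loop (x :: accum) (y :: t)
  in
  loop [] lst
;;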
(You can do it more efficiently than this, but this gets the idea across.)
Voilà: uniqueness in subquadratic time. And in practice, approaching
linear time: hash tables with lots of elements in a single bucket are
computationally expensive, so you're likely to be sorting a whole bunch of
short (1 and 2 element) lists.
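For what it's worth, here is a rough sketch of how the whole thing might
look from outside the Hashtbl module. It is an approximation of the idea,
not the per-bucket version described above: the buckets aren't visible
through the public interface, so this collects every key with Hashtbl.fold
and sorts the full list once, which is O(n log n) overall. The name "keys"
is hypothetical (no such function is assumed to exist in the standard
library), and the polymorphic compare is assumed to be a suitable ordering
for the key type.

```ocaml
(* Drop adjacent duplicates from an already-sorted list in O(n). *)
let uniq lst =
  let rec loop accum = function
    | [] -> List.rev accum
    | [x] -> List.rev (x :: accum)
    | x :: (y :: _ as rest) ->
      if x = y then loop accum rest        (* skip x, y stands in for it *)
      else loop (x :: accum) rest
  in
  loop [] lst

(* Hypothetical helper, not part of the standard Hashtbl module:
   gather all keys (duplicates included), sort, then deduplicate. *)
let keys tbl =
  Hashtbl.fold (fun k _ acc -> k :: acc) tbl []
  |> List.sort compare
  |> uniq

let () =
  let tbl = Hashtbl.create 16 in
  Hashtbl.add tbl "a" 1;
  Hashtbl.add tbl "b" 2;
  Hashtbl.add tbl "a" 3;   (* Hashtbl.add keeps the earlier binding for "a" too *)
  List.iter print_endline (keys tbl)
```

Hashtbl.fold visits every binding, including earlier ones hidden by a later
Hashtbl.add on the same key, which is why the deduplication step is needed
at all. The result comes back sorted as a side effect.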
--
"Usenet is like a herd of performing elephants with diarrhea -- massive,
difficult to redirect, awe-inspiring, entertaining, and a source of
mind-boggling amounts of excrement when you least expect it."
- Gene Spafford
Brian