Mailing list for all users of the OCaml language and system.
 help / color / mirror / Atom feed
From: oliver@first.in-berlin.de (Oliver Bandel)
To: caml-list@inria.fr
Subject: [oliver: Re: [Caml-list] Should be INSIDE STANDARD-LIB: Hashtbl.keys]
Date: Fri, 23 Apr 2004 22:09:23 +0200	[thread overview]
Message-ID: <20040423200923.GA271@first.in-berlin.de> (raw)

----- Forwarded message from oliver -----

To: Xavier Leroy <xavier.leroy@inria.fr>
Subject: Re: [Caml-list] Should be INSIDE STANDARD-LIB: Hashtbl.keys

On Fri, Apr 23, 2004 at 06:04:07PM +0200, Xavier Leroy wrote:
> > Perhaps I'm all wrong, but when I have to get rid of repetitions in a 
> > list, I first sort it in O(n log n), then remove the repetitions in O(n).
> > Jean-Baptiste.
> 
> OK, fair enough :-)  The point I was trying to make (not very well, I
> agree) is that "list without repetition" or "sorted list without
> repetition" is often not the best data structure for the job.
[...]

NOT the datastructure FOR THE JOB BUT FOR THE RESULT!


So, ok, because people now are posting their code, I send
one of me. I had many different implementations, maybe
one ore the other is more suited to do the task.
The one that I invented some minutes ago looks like this:



let keys_of_hash hash =
     let keys = Hashtbl.create 10000 in (* stupid hard coded value *)
     let sample k v = Hashtbl.replace keys k true in
     Hashtbl.iter sample hash;
     Hashtbl.fold (fun k v init -> k::init) keys []


Because there is no Hashtbl.size in the standard lib,
which gives back the number of over-all bindings in
the hash, I had to insert some stupid value.
(I was to lazy to write it by myself. It also
 does not make sense to walk through the whole
 hash to get the Hashtbl.size and then to allocate
 the keys-.Hash and then To iterate and then to fold...
 ...there will be more efficient solutions than that...
 btw: I don't know how inefficient it would be to use
 Hashtbl.create 1  and then let grow the hash itself...)


OK, it's possible to calculate Hashtbl.size with
Hastbl.iter, but when Hashtbl.size would be part
of the standard Hashtbl-lib, this would make sense.

I hope we can agree on that Hashtbl.size would return
an int?! ;-)

Hashtbl.create needs an int for the initial size, and
doing things like the above stuff, would be more
consistently when there would be a Hashtbl.size
inside the standard Hastbl-module.
a) The return type is clear (isn't it?!)
b) Estimating initial size of hashtables can be predicted in
   similar cases like the above one.

But nevertheless I insist in Hashtbl.keys as a very useful
function.
You then can use Hastbl.find as well as Hashtbl.find_all
for retrieval of all contents, depending on your needs.

Why I think Hashtbl.keys makes more sense than Hashtbl.values?
Because it reflects the sense of a Hashtable, to have direct
access to the values VIA THE KEYS.
And that also is the reason, why I think Hastbl.keys should
necessarily be part of the lib.

Also I think it will be (much) more efficient to do it inside the
Hashtbl-lib instead of doing it with such outside-approaches.
The direct way inside the lib seems to make more sense to me here.


Ciao,
   Oliver

----- End forwarded message -----

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


             reply	other threads:[~2004-04-23 20:20 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-04-23 20:09 Oliver Bandel [this message]
2004-04-24  3:00 ` skaller
2004-04-24  4:46   ` Jon Harrop
2004-04-24 10:39     ` Oliver Bandel
2004-04-25 18:54     ` Marcin 'Qrczak' Kowalczyk
2004-04-25 20:06       ` Jon Harrop
2004-04-26 10:14         ` Richard Jones
2004-04-26 17:13           ` Jon Harrop
2004-04-24  6:42   ` Basile STARYNKEVITCH
2004-04-24 19:12     ` skaller
2004-04-24  8:56   ` Oliver Bandel
2004-04-23 20:09 Oliver Bandel
2004-04-23 20:09 Oliver Bandel
2004-04-23 20:10 Oliver Bandel
2004-04-23 20:10 Oliver Bandel
2004-04-24  7:30 ` Martin Jambon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20040423200923.GA271@first.in-berlin.de \
    --to=oliver@first.in-berlin.de \
    --cc=caml-list@inria.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox