Mailing list for all users of the OCaml language and system.
 help / color / mirror / Atom feed
From: Toby Kelsey <toby.kelsey@gmail.com>
To: caml-list <caml-list@yquem.inria.fr>
Subject: [Caml-list] Hashtbl performance
Date: Wed, 27 Jul 2011 22:07:10 +0100	[thread overview]
Message-ID: <4E307DFE.3040502@gmail.com> (raw)

I have written two versions of a small program, one using an (int list) data structure and the other an ((int*int) list); and I tested using Hashtbl, Set and (Jean-Christophe Filliatre's) Trie to cache these elements in each version.
The relative run times of the programs turns out to be:

                 Hashtbl  Set  Trie
(int list)         1      2.1  2.0
((int*int) list)  34.7    2.6  2.1


There is a slight inherent speed difference between the 2 versions but the major effect is the cache type (in fact the caching difference must be larger than these ratios due to non-cache computations). I expected Hashtbl might be a bit slower than more specialised data structures, but the large speed difference with different data structures was unexpected. Presumably the slow-down is due to excessive hash collisions.

I had expected that the generic Hashtbl would be written to give adequate speed for all types of data and when I look at OCaml code on the web Hashtbl is usually used, so it seems most OCaml programmers believe the standard Hashtbl is a reasonable choice for most data types as well.

I haven't tested the Batteries or OCaml-Core hash tables so these may be more consistent, but if not, my question is can you predict how well Hashtbl will work for different types of data and so what to use it with, or it is just not reliable enough for general-purpose caching/memoization?

If anyone wants to look at the code, the (int list) Hashtbl version is at <http://rosettacode.org/wiki/Sokoban#OCaml> and the other versions are in <http://pastebin.com/hGn1AL9L> temporarily. Apart from the caching and the int/int pair changes they should be identical.

Toby

             reply	other threads:[~2011-07-27 21:07 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-07-27 21:07 Toby Kelsey [this message]
2011-07-28  9:48 ` Fabrice Le Fessant
2011-07-28 12:25   ` barnier
2011-07-28 14:18     ` Xavier Leroy
2011-07-28 16:29       ` Toby Kelsey

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4E307DFE.3040502@gmail.com \
    --to=toby.kelsey@gmail.com \
    --cc=caml-list@yquem.inria.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox