From: Jon Harrop <jon@ffconsultancy.com>
To: "Jean-Christophe Filliâtre" <Jean-Christophe.Filliatre@lri.fr>
Cc: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] Help with simple ocaml memoization problem
Date: Thu, 29 Nov 2007 18:57:43 +0000
Message-ID: <200711291857.43917.jon@ffconsultancy.com>
In-Reply-To: <474E7F2B.6090007@lri.fr>
On Thursday 29 November 2007 08:58, Jean-Christophe Filliâtre wrote:
> Jon Harrop wrote:
> > The Map implementation in the OCaml stdlib is also quite inefficient. I
> > did a little benchmark once and discovered that Maps actually waste more
> > space than Hashtbls.
>
> I find it unfair to compare an imperative and a persistent data
> structure for performance.
I agree.
> Of course you are going to use some extra
> space if you need to keep old versions of the data structures valid.
> But you are sharing *a lot* among the various versions. So if you are
> manipulating several sets/maps with common ancestors at the same time,
> you are saving memory w.r.t. other data structures such as hash tables.
True, my benchmark was a drop-in replacement with no sharing.
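
To make the sharing point concrete, here is a minimal sketch (not the benchmark itself, which is not shown in this thread). The name IntMap and the sample keys are assumptions chosen for illustration. Deriving two versions from one persistent map leaves the ancestor valid, and each new version copies only the nodes along the modified path, whereas keeping an old version of a hash table means copying the whole table first.

```ocaml
(* Integer-keyed map; IntMap is a name chosen for this sketch. *)
module IntMap = Map.Make (struct
  type t = int
  let compare (a : int) (b : int) = compare a b
end)

(* A common ancestor holding the squares of 1..5. *)
let base =
  List.fold_left (fun m k -> IntMap.add k (k * k) m) IntMap.empty [1; 2; 3; 4; 5]

(* Two descendants. Each one reuses the untouched subtrees of [base]. *)
let with_six = IntMap.add 6 36 base
let without_one = IntMap.remove 1 base

(* The imperative counterpart: to keep the old version, copy everything. *)
let tbl = Hashtbl.create 16
let () = List.iter (fun k -> Hashtbl.replace tbl k (k * k)) [1; 2; 3; 4; 5]
let snapshot = Hashtbl.copy tbl
let () = Hashtbl.replace tbl 6 36
```

After this runs, base, with_six and without_one are all usable at once, and snapshot is a full independent copy of the table as it was before the last update.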
> Of course, if you are using a single data structure, in a linear way,
> then yes a hash table is probably more efficient (provided you have a
> good hash function, which is not always easy to achieve).
>
> Regarding implementation of ocaml maps, I wouldn't say that it is
> inefficient: I did my own benchmarls (on sets, but this is the same
> code) and found that ocaml AVLs are really efficient, on the contrary.
> It usually beats other implementations (e.g. red-black trees from the
> SML stdlib), or even specialized structures such as Patricia trees (when
> keys are integers) on some operations.
I found that by manually unrolling with a Node1 constructor for single-element
nodes you can reduce GC load and increase performance by ~30%.
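
The idea can be sketched roughly as follows. This is a hypothetical set-of-int tree, not the code that was benchmarked: the type, the helper names make, add and mem are assumptions, and rebalancing is left out to keep it short. The point is only that a childless node is stored as Node1 (one field) instead of a full Node with two Empty children and a height (four fields), so leaf insertions allocate less.

```ocaml
(* Hypothetical tree with an extra constructor for single-element nodes.
   Rebalancing is omitted, so this is not a complete AVL tree. *)
type t =
  | Empty
  | Node1 of int               (* single element, no children: 1 field *)
  | Node of t * int * t * int  (* left, element, right, height: 4 fields *)

let height = function
  | Empty -> 0
  | Node1 _ -> 1
  | Node (_, _, _, h) -> h

(* Smart constructor: a node without children collapses to Node1. *)
let make l x r =
  match l, r with
  | Empty, Empty -> Node1 x
  | _ -> Node (l, x, r, 1 + max (height l) (height r))

let rec add x = function
  | Empty -> Node1 x
  | Node1 y as t ->
      if x = y then t
      else if x < y then Node (Node1 x, y, Empty, 2)
      else Node (Empty, y, Node1 x, 2)
  | Node (l, y, r, _) as t ->
      if x = y then t
      else if x < y then make (add x l) y r
      else make l y (add x r)

let rec mem x = function
  | Empty -> false
  | Node1 y -> x = y
  | Node (l, y, r, _) -> x = y || mem x (if x < y then l else r)

(* Seven elements inserted in an order that happens to stay balanced. *)
let s = List.fold_left (fun acc x -> add x acc) Empty [4; 2; 6; 1; 3; 5; 7]
```

In this example the four leaves (1, 3, 5, 7) are Node1 values, so more than half of the nodes take the small representation.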
Perhaps "badly optimized" would have been a better phrase. For example, the
Map implementation in the OCaml stdlib manually inlines the height function
even though it makes relatively little difference to performance: GC load is
the real killer for most immutable data structures.
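
For readers unfamiliar with what "manually inlining the height function" looks like, here is a small sketch. It is an illustration written for this note, not a copy of the stdlib source; the type and the names create and create_inlined are assumptions. Both versions build the same node and allocate the same amount, which is why the hand-inlined form saves at most a function call and does nothing for GC load.

```ocaml
type 'a tree =
  | Empty
  | Node of 'a tree * 'a * 'a tree * int  (* left, value, right, cached height *)

let height = function
  | Empty -> 0
  | Node (_, _, _, h) -> h

(* Version 1: calls the helper. *)
let create l x r =
  let hl = height l and hr = height r in
  Node (l, x, r, (if hl >= hr then hl + 1 else hr + 1))

(* Version 2: the same logic with the helper's match written out by hand. *)
let create_inlined l x r =
  let hl = match l with Empty -> 0 | Node (_, _, _, h) -> h in
  let hr = match r with Empty -> 0 | Node (_, _, _, h) -> h in
  Node (l, x, r, (if hl >= hr then hl + 1 else hr + 1))
```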
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e