From: Francois Berenger <francois.berenger@inria.fr>
To: caml-list@inria.fr
Subject: Re: [Caml-list] Hash consed Patricia trees
Date: Wed, 25 May 2016 15:20:03 +0200
Message-ID: <5745A683.2050108@inria.fr>
In-Reply-To: <CABbVA-Bn7CVFv77r6dMedVKTPR7HJEJ9pSGXJh_PwjEPMnq4gQ@mail.gmail.com>
On 25/05/2016 14:29, Boris Yakobowski wrote:
> Hi,
>
> The Value Analysis plugin of Frama-C uses hash-consing of Patricia
> trees extensively. In fact, some analyses would not run without it at
> all. See Section 9 of
> http://cristal.inria.fr/~doligez/publications/cuoq-doligez-mlw-2008.ps
> for more details. Unfortunately, as mentioned there, no figures exist
> comparing runs with and without hash-consing -- but most of the
> examples would have failed without it.
>
> Although I'm not sure what was implemented exactly at the time, one
> important feature when using hash-consed Patricia trees is the
> possibility of using caches. Alain mentioned this in this mail:
>
>> Also, you get a nice unique integer for each tree. This allows you to
>> memoize efficiently set operations (like union, intersection, for which
>> you can use memoization in the inner loop, not only at toplevel), and to
>> build sets of sets (and so on).
>
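A minimal sketch of the "unique integer for each tree" idea, for
readers who have not seen it. This is an illustration only, not the
Frama-C hptmap code: all names (leaf, branch, hashcons, tag) are
hypothetical, elements are assumed to be non-negative, and a plain
Hashtbl stands in for the weak hash table a real implementation would
use to let unused nodes be collected.

```ocaml
(* Hash-consed little-endian Patricia trees for sets of non-negative ints.
   Illustrative sketch: names are hypothetical, not Frama-C's hptmap API. *)
type t =
  | Empty
  | Leaf of int * int                   (* tag, element *)
  | Branch of int * int * int * t * t   (* tag, prefix, branching bit, left, right *)

let tag = function
  | Empty -> 0
  | Leaf (id, _) | Branch (id, _, _, _, _) -> id

(* Structural key of a node. Children are identified by their tags only,
   so hashing and comparing a key takes constant time. *)
type key = KLeaf of int | KBranch of int * int * int * int

(* Assumption: a strong table for simplicity; it never frees nodes. *)
let table : (key, t) Hashtbl.t = Hashtbl.create 1024
let next_tag = ref 1

(* Return the existing node for [key], or build one with a fresh tag. *)
let hashcons key build =
  match Hashtbl.find_opt table key with
  | Some node -> node
  | None ->
    let node = build !next_tag in
    incr next_tag;
    Hashtbl.add table key node;
    node

let leaf k = hashcons (KLeaf k) (fun id -> Leaf (id, k))

let branch p m l r =
  match l, r with
  | Empty, t | t, Empty -> t
  | _ -> hashcons (KBranch (p, m, tag l, tag r))
           (fun id -> Branch (id, p, m, l, r))

(* Standard bit helpers for little-endian Patricia trees. *)
let zero_bit k m = k land m = 0
let mask k m = k land (m - 1)
let match_prefix k p m = mask k m = p
let branching_bit p0 p1 = let x = p0 lxor p1 in x land (-x)

(* Combine two trees whose prefixes [p0] and [p1] differ. *)
let join p0 t0 p1 t1 =
  let m = branching_bit p0 p1 in
  if zero_bit p0 m then branch (mask p0 m) m t0 t1
  else branch (mask p0 m) m t1 t0

let rec add k t =
  match t with
  | Empty -> leaf k
  | Leaf (_, j) -> if j = k then t else join k (leaf k) j t
  | Branch (_, p, m, l, r) ->
    if match_prefix k p m then
      if zero_bit k m then branch p m (add k l) r
      else branch p m l (add k r)
    else join k (leaf k) p t
```

Because a Patricia tree has a single shape for a given set, two sets
built in different insertion orders end up as the same physical node,
so set equality is (==) and tag s can serve directly as a key in a
Hashtbl, which is what makes sets of sets and memo tables cheap.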
> I should stress that the possibility of memoizing *in the inner loop*
> is crucial. When performing e.g. unions or map2 operations, it is
> possible to return a result in constant time when either
> - the two trees are equivalent (because e.g. union s s == s)
> - the two trees have already been merged, and the result is in the cache.
> In practice, most operations become O(D ln D), where D is the number of
> differences between the two trees, or even O(1) if the cache is big
> enough and the operations repetitive enough.
>
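The two constant-time exits described above can be sketched as
follows. Again this is only an illustration under assumptions, not
what hptmap does: names are hypothetical, elements are assumed
non-negative, both tables are plain unbounded Hashtbl values (a real
implementation would more likely use a weak table for nodes and a
bounded cache), and the block is self-contained so that it can be
compiled on its own.

```ocaml
(* Memoized union on hash-consed Patricia trees (illustrative sketch). *)
type t =
  | Empty
  | Leaf of int * int                   (* tag, element *)
  | Branch of int * int * int * t * t   (* tag, prefix, branching bit, left, right *)

let tag = function
  | Empty -> 0
  | Leaf (id, _) | Branch (id, _, _, _, _) -> id

type key = KLeaf of int | KBranch of int * int * int * int
let nodes : (key, t) Hashtbl.t = Hashtbl.create 1024
let next_tag = ref 1

let hashcons key build =
  match Hashtbl.find_opt nodes key with
  | Some node -> node
  | None ->
    let node = build !next_tag in
    incr next_tag; Hashtbl.add nodes key node; node

let leaf k = hashcons (KLeaf k) (fun id -> Leaf (id, k))
let branch p m l r =
  match l, r with
  | Empty, t | t, Empty -> t
  | _ -> hashcons (KBranch (p, m, tag l, tag r))
           (fun id -> Branch (id, p, m, l, r))

let zero_bit k m = k land m = 0
let mask k m = k land (m - 1)
let match_prefix k p m = mask k m = p
let join p0 t0 p1 t1 =
  let x = p0 lxor p1 in
  let m = x land (-x) in                 (* lowest differing bit *)
  if zero_bit p0 m then branch (mask p0 m) m t0 t1
  else branch (mask p0 m) m t1 t0

(* Cache of already computed unions, keyed by the pair of tags. *)
let cache : (int * int, t) Hashtbl.t = Hashtbl.create 1024

let rec union s t =
  if s == t then s                       (* exit 1: union s s == s *)
  else match s, t with
    | Empty, u | u, Empty -> u
    | _ ->
      (* union is commutative, so order the tags in the key *)
      let key = (min (tag s) (tag t), max (tag s) (tag t)) in
      match Hashtbl.find_opt cache key with
      | Some r -> r                      (* exit 2: already merged *)
      | None ->
        let r = union_nodes s t in
        Hashtbl.add cache key r; r

(* The usual Patricia merge; every recursive call goes back through
   [union], so the cache is consulted in the inner loop. *)
and union_nodes s t =
  match s, t with
  | Empty, u | u, Empty -> u
  | Leaf (_, k), Leaf (_, j) -> if k = j then s else join k s j t
  | Leaf (_, k), Branch (_, p, m, l, r) -> add_leaf k s p m l r t
  | Branch (_, p, m, l, r), Leaf (_, k) -> add_leaf k t p m l r s
  | Branch (_, p, m, s0, s1), Branch (_, q, n, t0, t1) ->
    if m = n && p = q then branch p m (union s0 t0) (union s1 t1)
    else if m < n && match_prefix q p m then
      if zero_bit q m then branch p m (union s0 t) s1
      else branch p m s0 (union s1 t)
    else if m > n && match_prefix p q n then
      if zero_bit p n then branch q n (union s t0) t1
      else branch q n t0 (union s t1)
    else join p s q t

and add_leaf k lf p m l r tree =
  if match_prefix k p m then
    if zero_bit k m then branch p m (union lf l) r
    else branch p m l (union lf r)
  else join k lf p tree

let of_list l = List.fold_left (fun s k -> union s (leaf k)) Empty l
```

The point to notice is that subtrees shared between the two arguments
hit the s == t test immediately, so only the parts that differ are
traversed, and a repeated union of the same pair is a single table
lookup.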
> If this kind of caching may be useful to you, the files hptmap*.ml* of
> Frama-C provide very nice iterators and abstractions.
It might even be useful to have this data structure available in opam
as a standalone library.
> HTH,
>
>
> On Mon, May 23, 2016 at 4:33 PM, Neuhaeusser, Martin
> <martin.neuhaeusser@siemens.com> wrote:
>
> Dear all,
>
> during some experiments with integer set implementations, I came
> across a discussion on that list that proposed to use Patricia trees
> and hash consing on the tree nodes' constructors to achieve maximal
> sharing:
> http://caml.inria.fr/pub/ml-archives/caml-list/2008/03/5be97d51e2e8aab16b9e7e369a5a5533.en.html
>
> Is anyone aware of a corresponding implementation that also has a
> performance benefit (or, at least, no negative performance impact)
> compared to standard sets or to non-hash consed Patricia trees? Or
> is anyone aware of a paper on that matter?
>
> Sadly, in all my experiments, the combination of Patricia trees with
> hash consing applied to the terms representing the tree has a
> horrible impact on performance (a slowdown by an order of
> magnitude). After some thought, this seems
> reasonable given the structure of a Patricia tree. In particular, we
> found no way to make significant use of the reflexivity properties
> obtained by hash consing in set operations like subset or union. In
> our benchmarks, the time for constructing hash-consed subtrees
> during set operations outweighs any gains obtained by the "physical
> equality = set equality" property. Or is the whole point in the
> earlier discussion the possibility to use hash consing tags for
> memoization of set operations?
>
> Any hints and comments are highly appreciated. It would really be
> great if some of the participants from the 2008 discussion could
> perhaps share their experience.
>
> Best regards,
> Martin
>
> --
> Caml-list mailing list. Subscription management and archives:
> https://sympa.inria.fr/sympa/arc/caml-list
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
>
>
>
>
> --
> Boris
--
Regards,
Francois.
"When in doubt, use more types"