From: Mauricio Fernandez <mfp@acm.org>
To: caml-list <caml-list@inria.fr>
Subject: Ropes and rope-like functional extensible vectors with O(1) prepend/append.
Date: Sun, 29 Jul 2007 01:33:05 +0200 [thread overview]
Message-ID: <20070728233305.GB28879@tux-chan> (raw)
In the aftermath of the ICFP contest, during which I used Luca de Alfaro's
Vec, I felt like implementing ropes, based on Boehm's paper and the well-known
code included in his conservative garbage collector.
I later realized that the basic implementation strategies ("dense" leaves,
bounded tree height and amortized constant time concatenation of small
strings) could be generalized to build general extensible vectors similar to
Vec.
Such vectors (tentatively named "Vect" until I find a better name) have some
interesting properties:
* lower space overhead than other structures based on balanced trees such as
Vec. The overhead can be adjusted, allowing to make "get" faster at the
expense of "set" and viceversa.
* appending or prepending a small vector to an arbitrarily large one in
amortized constant time
* concat, subarray, insert, remove operations in amortized logarithmic time
* access and modification (get, set) in logarithmic time
The first one is probably the most compelling. Currently, Vec imposes a 6-word
overhead per element. Even after the obvious modification consisting in adding
a new constructor for leaves, the overhead would still be 350%... Vect uses
compact leaves with a configurable number of elements (32-64 seem good
choices, leading to worst-case overheads of 100% and 50% respectively), which
also helps with "get" due to the improved spatial locality.
You can find the code for both Rope and Vect at
http://eigenclass.org/repos/oropes/head/
It is still young and experimental, but it's maybe worth a look. Any feedback
is very welcome.
The documentation can be found under
http://eigenclass.org/repos/oropes/head/doc/
I've spent some time benchmarking it against Vec; you can also find the
code I used and the resulting graphs at the above address.
To summarise how it compares to Vec:
* Vec can be used when persistence is required, but Vect would probably be a
poor choice in this case (until that is fixed using lazy rebuilding, which
doesn't seem too hard), unless rebalancing explicitly before "taking the
snapshot" is an option
* Vect can append/prepend single elements or small vectors very quickly, in
amortized constant time. See
http://eigenclass.org/repos/oropes/head/append.png
* as expected, Vec.set is faster than Vect's in general
http://eigenclass.org/repos/oropes/head/set.png
However, if the vector is balanced explicitly shortly before an update
burst, Vect is somewhat surprisingly faster
http://eigenclass.org/repos/oropes/head/set-balanced.png
This might be attributed to Vect's smaller memory profile and the fact that
it allows better use of the L2 cache, but there seems to be another factor
that I have yet to discover.
* Vect.get is considerably faster than Vec.get
http://eigenclass.org/repos/oropes/head/get.png
The above URL is a darcs repository, so if anybody shoots me a patch I'll be
happy to apply it :)
Regards,
--
Mauricio Fernandez - http://eigenclass.org
next reply other threads:[~2007-07-28 23:33 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-07-28 23:33 Mauricio Fernandez [this message]
2007-07-30 0:46 ` [Caml-list] " Jon Harrop
2007-07-30 11:51 ` Mauricio Fernandez
2007-07-30 13:47 ` Jon Harrop
2007-07-31 23:55 ` Mauricio Fernandez
2007-08-04 10:36 ` Jon Harrop
2007-08-09 22:07 ` Nathaniel Gray
2007-08-21 21:39 ` Luca de Alfaro
2007-08-22 2:54 ` skaller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20070728233305.GB28879@tux-chan \
--to=mfp@acm.org \
--cc=caml-list@inria.fr \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox