Mailing list for all users of the OCaml language and system.
 help / color / mirror / Atom feed
From: Gerd Stolpmann <info@gerd-stolpmann.de>
To: Stefano Zacchiroli <zack@cs.unibo.it>
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] Suggestion about balanced trees in stdlib
Date: Sat, 11 May 2002 17:47:04 +0200	[thread overview]
Message-ID: <20020511174704.A632@ice.gerd-stolpmann.de> (raw)
In-Reply-To: <20020510223110.GB31219@cs.unibo.it>; from zack@cs.unibo.it on Sat, May 11, 2002 at 00:31:10 +0200


On 2002.05.11 00:31 Stefano Zacchiroli wrote:
> On Fri, May 10, 2002 at 06:59:54PM +0200, Gerd Stolpmann wrote:
> > Of course, Set and Map would recur to that more special module.
> > 
> > Balanced_tree would have the representation of Map, i.e. the elements are
> > pairs of keys and attached values. To emulate Set, the value () is used.
> 
> I don't know how the compiler treat unit values wrt optimization or
> such, but using an unit in each node to emulate Set seems to me
> inefficient for a module of the standard library.
> 
> I think, but I haven't thought a lot about it, that the problem can be
> solved using an implementation polymorphic in the leaf type hidden to
> the use that access Set and Map through functors as usual.

type 'a map = Empty | Node of 'a map * key * 'a * 'a map * int

type set = Empty | Node of set * elt * set * int

The difference is that every map node has 6 words, and every set node
consists of 5 words. My suggestion is to prefer the map representation,
and define set = unit map, wasting one word per node, and making sets
20% larger.

The other possibility would be to prefer the set representation, e.g.

type 'a baltree = Empty | Node of 'a baltree * 'a elt * 'a baltree * int

and to reduce map and set with some functor tricks to this representation.
Now sets are not larger, but maps require 8 words of memory, because
you have to use an additional pair 'a elt = ('a, map_elt), i.e. maps are
33% larger. I think this is the worse choice.

BTW, if memory consumption matters, the current representation can be still
optimized by removing the balance factor, and encoding it in the variant
tag:

type set = Empty | Node_0 of set * elt * set | Node_l of <same> | Node_r of <same>

Of course, the readability of the code would suffer, but one can argue that 
the overall performance is more important than code beauty for the standard 
library. However, the point is that it is a matter of discussion what is
acceptable in the stdlib, because there are a lot of views to this central
library. I would prefer to pay an increased memory consumption of 20% for
sets, and to get the possibility to use balanced trees directly for that
costs. Maybe other people have different views on this.

> Anyway the idea seems really good.
> 
> I also suggest to add different kind of visit (i.e. not only iter) like
> {pre,post,in}visit via different functions or specifing the desired
> behaviour when using the functor or both.

Why not export functions that access the subtrees directly:
left_subtree, right_subtree, top_element.

Gerd
-- 
----------------------------------------------------------------------------
Gerd Stolpmann      Telefon: +49 6151 997705 (privat)
Viktoriastr. 45             
64293 Darmstadt     EMail:   gerd@gerd-stolpmann.de
Germany                     
----------------------------------------------------------------------------
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


  parent reply	other threads:[~2002-05-11 15:47 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2002-05-10 16:59 Gerd Stolpmann
2002-05-10 16:56 ` Alexander V.Voinov
2002-05-10 22:31 ` Stefano Zacchiroli
2002-05-10 23:05   ` Alexander V.Voinov
2002-05-11  7:20     ` Stefano Zacchiroli
2002-05-11 15:47   ` Gerd Stolpmann [this message]
2002-05-11 17:18     ` Stefano Zacchiroli
2002-05-11 17:36       ` Dave Mason
2002-05-12  0:03       ` polux moon
2002-05-13  7:23 ` Mattias Waldau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20020511174704.A632@ice.gerd-stolpmann.de \
    --to=info@gerd-stolpmann.de \
    --cc=caml-list@inria.fr \
    --cc=zack@cs.unibo.it \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox