From: Benjamin Ylvisaker <ben8@cs.washington.edu>
To: Julien Signoles <Julien.Signoles@cea.fr>
Cc: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] ocamlgraph predecessors
Date: Wed, 26 Aug 2009 10:35:47 -0400
Message-ID: <D42E2AE8-E252-4528-8D1C-EEC25120502A@cs.washington.edu>
In-Reply-To: <4A93F39A.7090704@cea.fr>
On Aug 25, 2009, at 10:22 AM, Julien Signoles wrote:
> Benjamin Ylvisaker wrote:
>> I have been using ocamlgraph for a while, and have been generally
>> happy with it. I experienced some poor performance with moderately
>> large graphs (10-100k vertices) recently, which led me to look
>> through the source code a little. It seems that doing anything
>> with the predecessors of a vertex, even just getting a list of
>> them, requires scanning through all the vertices in a graph. This
>> seems a little crazy to me. Am I missing something? Is there some
>> kind of work-around that gives reasonable performance for
>> predecessor operations (i.e., not O(|V|))?
>
> Actually, looking at the current implementation, accessing
> predecessors is worse than O(|V|): it is max(O(|V|), O(|E|)).
>
> If you use concrete (imperative directed) graphs, the simplest
> work-around is to use Imperative.Digraph.ConcreteBidirectional, as
> suggested by Kevin Cheung. It uses more memory (at worst twice as
> much) than standard concrete directed graphs, but accessing
> predecessors is O(1) amortized instead of max(O(|V|), O(|E|)), and
> removing a vertex is O(D*ln(D)), where D is the maximum degree of
> the graph, instead of O(|V|*ln(|V|)).
>
> If you don't use this functor, other work-arounds have been
> suggested in other posts.
>
> By the way, contributing to ocamlgraph by adding
> Imperative.Digraph.AbstractBidirectional (for instance) is still
> possible and welcome :o).
Thanks for your suggestions. I had not noticed the
ConcreteBidirectional module, but it looks like it wouldn't be a drop-
in replacement for me, because it's unlabeled and I need labels.
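
Just to illustrate, here is a rough sketch of what instantiating that
functor looks like (illustrative only; I'm assuming the usual
COMPARABLE vertex module) -- note that there is no edge label anywhere
in its interface:

  module IntVertex = struct
    type t = int
    let compare = compare
    let hash = Hashtbl.hash
    let equal = (=)
  end

  module BG = Graph.Imperative.Digraph.ConcreteBidirectional (IntVertex)

  let () =
    let g = BG.create () in
    BG.add_edge g 1 2;
    BG.add_edge g 3 2;
    (* predecessors of 2 are available without scanning the graph,
       but there is nowhere to hang a label on either edge *)
    List.iter (Printf.printf "%d\n") (BG.pred g 2)
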
If anyone is curious, here is the wrapper logic that I ended up adding:

When the user wants an edge from v1 to v2 with label X, two "internal"
edges get created: one from v1 to v2 with label EdgeForward (X) and
one from v2 to v1 with label EdgeBackward (X). These two edges are
considered equivalent by the wrapper code. Additionally, to make edge
removal faster, I added a table that maps each edge to its "pair". All
edge-related wrapper functions can take either edge of such a pair,
and use the source or destination vertex of the edge, depending on the
Forward/Backward label. The edge removal wrapper function looks up the
mate of the passed-in edge in the table and removes them both.
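
In code, the edge half of the wrapper looks roughly like the sketch
below. The names and types are made up for illustration, not the
actual code; "payload" stands in for my real edge label type, and the
base graph is an ordinary labeled, unidirectional ocamlgraph digraph:

  type payload = string  (* stand-in for the user-visible edge label *)

  type wrapped = EdgeForward of payload | EdgeBackward of payload

  (* Vertex labels carry a mutable flag used for cheap removal below. *)
  type vinfo = { id : int; mutable removed : bool }

  module V = struct
    type t = vinfo
    let compare a b = compare a.id b.id
    let hash a = Hashtbl.hash a.id
    let equal a b = a.id = b.id
  end

  module E = struct
    type t = wrapped
    let compare = compare
    let default = EdgeForward ""
  end

  module G = Graph.Imperative.Digraph.ConcreteLabeled (V) (E)

  (* Adding an external edge creates the forward/backward pair and
     records each internal edge's mate for fast paired removal. *)
  let add_edge g pair_tbl v1 v2 x =
    let fwd = G.E.create v1 (EdgeForward x) v2 in
    let bwd = G.E.create v2 (EdgeBackward x) v1 in
    G.add_edge_e g fwd;
    G.add_edge_e g bwd;
    Hashtbl.replace pair_tbl fwd bwd;
    Hashtbl.replace pair_tbl bwd fwd

  (* Removing either edge of a pair removes both. *)
  let remove_edge g pair_tbl e =
    (try
       let mate = Hashtbl.find pair_tbl e in
       G.remove_edge_e g mate;
       Hashtbl.remove pair_tbl mate
     with Not_found -> ());
    G.remove_edge_e g e;
    Hashtbl.remove pair_tbl e
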
To make vertex removal faster, I added a Boolean "removed" field to
vertex labels that is set to false on vertex creation. When the
wrapper vertex removal function is called, it removes all the incident
edges and then just sets the vertex's removed flag to true. Vertex
scanning functions then need to check the flag to determine whether an
"internal" vertex should actually be considered part of the graph.
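
Continuing the same sketch (still made-up names), the vertex side
looks something like:

  (* Mark the vertex dead instead of really deleting it; every incident
     internal edge has an outgoing representative at v, so succ_e is
     enough to find them all. *)
  let remove_vertex g pair_tbl v =
    List.iter (fun e -> remove_edge g pair_tbl e) (G.succ_e g v);
    v.removed <- true

  (* Scanning functions skip vertices that are only "garbage". *)
  let iter_live_vertices f g =
    G.iter_vertex (fun v -> if not v.removed then f v) g
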
If there is a lot of vertex creation and removal in an application,
a lot of "garbage" vertices will end up polluting the graph. When the
amount of garbage gets too large, a new copy of the graph can be
constructed with only the "real" vertices copied over. This is not a
totally transparent operation, because any tables external to the
graph that are keyed on vertices or edges will be invalidated by the
copy.
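
The compaction pass could look something like this (again just a
sketch; it hands back a fresh graph and a fresh edge-pair table, and
anything external keyed on the old edges has to be rebuilt
separately):

  (* Build a fresh graph containing only the live vertices and the
     edges between them.  Only EdgeForward edges are re-added, since
     add_edge re-creates each one's EdgeBackward mate. *)
  let compact g =
    let g' = G.create () in
    let pair_tbl' = Hashtbl.create 17 in
    iter_live_vertices (fun v -> G.add_vertex g' v) g;
    iter_live_vertices
      (fun v ->
         List.iter
           (fun e ->
              match G.E.label e with
              | EdgeForward x -> add_edge g' pair_tbl' v (G.E.dst e) x
              | EdgeBackward _ -> ())
           (G.succ_e g v))
      g;
    (g', pair_tbl')
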
This set of wrapper logic clearly adds a non-trivial amount of memory
overhead:
  1) The edge and vertex labels are a little bit bigger.
  2) There are two internal edges for every externally visible edge.
  3) The edge-pair table adds another O(|E|) chunk of memory.
  4) An application-dependent number of garbage vertices will be
     floating around.
I'm pretty sure all the wrapper operations are asymptotically as fast
as they reasonably can be, though. If an implementation like this were
done in the library itself (notice the strategic use of the passive
voice) instead of as a wrapper, I'm pretty sure vertex removal could
be handled more cleanly.
Ben