From: Markus Mottl <markus.mottl@gmail.com>
To: Gabriel Scherer <gabriel.scherer@gmail.com>
Cc: Lukasz Stafiniak <lukstafi@gmail.com>, OCaml List <caml-list@inria.fr>
Subject: Re: [Caml-list] Covariant GADTs
Date: Wed, 21 Sep 2016 20:39:12 -0400 [thread overview]
Message-ID: <CAP_800oy7Ug9PO7YajRxwH+ZsYthkOSefXEOKYh55eUsfEa-Zw@mail.gmail.com> (raw)
In-Reply-To: <CAPFanBGDV6YJsijYrf8pXDzNwEEyQN32CX0+k2sbvrc56cCsUg@mail.gmail.com>
The direct comparison with the Jane Street implementation showed a 40%
speed increase for some random things I tried, but that's not a fair
comparison. If I improve the JS code, e.g. to avoid allocations, the
performance improvement due to the GADT + inlined records drops to
only about 10%.
In terms of memory, a freshly created set costs 7 machine words in the
original code vs. 5 for the GADT. Adding one rank costs 4 machine
words in the standard implementation vs. only 2 for GADTs. That's a
pretty significant size reduction. The GADT representation would
surely help in programs that allocate a lot of these values, but the
values don't tend to grow much internally due to the tree compression
algorithm. I'm sure there are better examples where a program would
typically allocate GADT-based data structures of more significant
size.
Regards,
Markus
On Wed, Sep 21, 2016 at 5:40 PM, Gabriel Scherer
<gabriel.scherer@gmail.com> wrote:
> Very nice. Would you have more precise numbers for the "considerably more
> efficient" part? It's not always easy to find clear benefits to inline
> records on representative macro-benchmarks.
>
> On Thu, Sep 22, 2016 at 2:04 AM, Markus Mottl <markus.mottl@gmail.com>
> wrote:
>>
>> Here is a complete working example of the advantages of using GADTs
>> with inline records. It also uses the [@@unboxed] feature now
>> available with OCaml 4.04 as discussed before here, though it required
>> a little workaround due to an apparent bug in the current beta.
>>
>> The below implementation of the union-find algorithm is considerably
>> more efficient (with the 4.04 beta only) than the Union_find
>> implementation in the Jane Street Core kernel. The problem admittedly
>> lends itself to the GADT + inline record trick.
>>
>> There is actually one advantage to using an intermediate, unboxed GADT
>> tag compared to records with existentially quantified fields (if they
>> were available): functions matching the tag don't require those
>> horrible type annotations for locally abstract types, because the
>> match automatically sets up the scope for you. Having to write "Node
>> foo" instead of just "foo" in some places isn't too bad. Not sure
>> it's possible to have the best of both worlds.
>>
>> ----------
>> module Union_find = struct
>> (* This does not work yet due to an OCaml 4.04 beta bug
>> type ('a, 'kind) tree =
>> | Root : { mutable value : 'a; mutable rank : int } -> ('a, [ `root ])
>> tree
>> | Inner : { mutable parent : 'a node } -> ('a, [ `inner ]) tree
>>
>> and 'a node = Node : ('a, _) tree -> 'a node [@@ocaml.unboxed]
>>
>> type 'a t = ('a, [ `inner ]) tree
>> *)
>>
>> type ('a, 'kind, 'parent) tree =
>> | Root : { mutable value : 'a; mutable rank : int } ->
>> ('a, [ `root ], 'parent) tree
>> | Inner : { mutable parent : 'parent } -> ('a, [ `inner ], 'parent)
>> tree
>>
>> type 'a node = Node : ('a, _, 'a node) tree -> 'a node
>> [@@ocaml.unboxed]
>>
>> type 'a t = ('a, [ `inner ], 'a node) tree
>>
>> let create v = Inner { parent = Node (Root { value = v; rank = 0 }) }
>>
>> let rec compress ~repr:(Inner inner as repr) = function
>> | Node (Root _ as root) -> repr, root
>> | Node (Inner next_inner as repr) ->
>> let repr, _ as res = compress ~repr next_inner.parent in
>> inner.parent <- Node repr;
>> res
>>
>> let compress_inner (Inner inner as repr) = compress ~repr inner.parent
>>
>> let get_root (Inner inner) =
>> match inner.parent with
>> | Node (Root _ as root) -> root (* Avoids compression call *)
>> | Node (Inner _ as repr) ->
>> let repr, root = compress_inner repr in
>> inner.parent <- Node repr;
>> root
>>
>> let get t = let Root r = get_root t in r.value
>>
>> let set t x = let Root r = get_root t in r.value <- x
>>
>> let same_class t1 t2 = get_root t1 == get_root t2
>>
>> let union t1 t2 =
>> let Inner inner1 as repr1, (Root r1 as root1) = compress_inner t1 in
>> let Inner inner2 as repr2, (Root r2 as root2) = compress_inner t2 in
>> if root1 == root2 then ()
>> else
>> let n1 = r1.rank in
>> let n2 = r2.rank in
>> if n1 < n2 then inner1.parent <- Node repr2
>> else begin
>> inner2.parent <- Node repr1;
>> if n1 = n2 then r1.rank <- r1.rank + 1
>> end
>> end (* Union_find *)
>> ----------
>>
>> Regards,
>> Markus
>>
>> On Wed, Sep 21, 2016 at 6:14 AM, Lukasz Stafiniak <lukstafi@gmail.com>
>> wrote:
>> > On Wed, Sep 21, 2016 at 12:11 PM, Lukasz Stafiniak <lukstafi@gmail.com>
>> > wrote:
>> >>
>> >> A simple solution would be to "A-transform" (IIRC the term) accesses
>> >
>> > Sorry, I forgot to define this. I mean rewrite rules like:
>> > [f r.x] ==> [let x = r.x in f x]
>> > where subsequently the existential variable is introduced (unpacked)
>> > at the let-binding level. This corresponds to a single-variant GADT
>> > pattern match.
>> >
>> >> to fields with existential type variables. This would give a more
>> >> narrow scope on the expression level than you suggest, but a
>> >> well-defined one prior to type inference. To broaden the scope you
>> >> would need to let-bind the field access yourself at the appropriate
>> >> level.
>>
>>
>>
>> --
>> Markus Mottl http://www.ocaml.info markus.mottl@gmail.com
>>
>> --
>> Caml-list mailing list. Subscription management and archives:
>> https://sympa.inria.fr/sympa/arc/caml-list
>> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
>> Bug reports: http://caml.inria.fr/bin/caml-bugs
>
>
--
Markus Mottl http://www.ocaml.info markus.mottl@gmail.com
next prev parent reply other threads:[~2016-09-22 0:39 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-09-17 17:38 Markus Mottl
2016-09-18 8:17 ` Petter A. Urkedal
2016-09-19 1:52 ` Markus Mottl
2016-09-19 8:58 ` octachron
2016-09-19 10:18 ` Mikhail Mandrykin
2016-09-19 13:37 ` Mikhail Mandrykin
2016-09-19 14:46 ` Markus Mottl
2016-09-19 14:53 ` Mikhail Mandrykin
2016-09-19 15:03 ` Markus Mottl
2016-09-20 21:07 ` Markus Mottl
2016-09-21 10:11 ` Lukasz Stafiniak
2016-09-21 10:14 ` Lukasz Stafiniak
2016-09-21 17:04 ` Markus Mottl
2016-09-21 21:40 ` Gabriel Scherer
2016-09-22 0:39 ` Markus Mottl [this message]
2016-09-24 5:09 ` Yaron Minsky
2016-10-04 10:33 ` Jacques Garrigue
2016-09-19 14:39 ` Markus Mottl
2016-09-19 10:05 ` Goswin von Brederlow
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAP_800oy7Ug9PO7YajRxwH+ZsYthkOSefXEOKYh55eUsfEa-Zw@mail.gmail.com \
--to=markus.mottl@gmail.com \
--cc=caml-list@inria.fr \
--cc=gabriel.scherer@gmail.com \
--cc=lukstafi@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox