From: Yaron Minsky <yminsky@janestreet.com>
To: Markus Mottl <markus.mottl@gmail.com>
Cc: Gabriel Scherer <gabriel.scherer@gmail.com>,
Lukasz Stafiniak <lukstafi@gmail.com>,
OCaml List <caml-list@inria.fr>
Subject: Re: [Caml-list] Covariant GADTs
Date: Sat, 24 Sep 2016 14:09:45 +0900
Message-ID: <CACLX4jQK1Y9LAKto1WxKEh4RP8xyqEvhnjy=Lo1x8ijeKUudWw@mail.gmail.com>
In-Reply-To: <CAP_800oy7Ug9PO7YajRxwH+ZsYthkOSefXEOKYh55eUsfEa-Zw@mail.gmail.com>
This looks like a nice improvement. A PR would be very welcome...
y
On Thu, Sep 22, 2016 at 9:39 AM, Markus Mottl <markus.mottl@gmail.com> wrote:
> The direct comparison with the Jane Street implementation showed a 40%
> speed increase for some random things I tried, but that's not a fair
> comparison. If I improve the JS code, e.g. to avoid allocations, the
> performance improvement due to the GADT + inlined records drops to
> only about 10%.
>
> In terms of memory, a freshly created set costs 7 machine words in the
> original code vs. 5 for the GADT. Adding one rank costs 4 machine
> words in the standard implementation vs. only 2 for GADTs. That's a
> pretty significant size reduction. The GADT representation would
> surely help in programs that allocate a lot of these values, but the
> values don't tend to grow much internally due to the tree compression
> algorithm. I'm sure there are better examples where a program would
> typically allocate GADT-based data structures of more significant
> size.
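
The word counts quoted above can be checked, roughly, with `Obj.reachable_words` (available since OCaml 4.04). The following is a minimal sketch under stated assumptions: the `Classic` module is a guess at a boxed record-plus-variant layout that would produce the quoted figures, not code copied from the Jane Street library; `Gadt` reuses the type definitions posted in this thread; and the `wrap` helpers are hypothetical, standing in for "adding one rank" as one extra level of indirection.

```ocaml
(* Boxed layout (assumed): a mutable cell pointing at a boxed variant. *)
module Classic = struct
  type 'a root = { mutable value : 'a; mutable rank : int }

  type 'a t = { mutable node : 'a node }
  and 'a node = Inner of 'a t | Root of 'a root

  let create v = { node = Root { value = v; rank = 0 } }

  (* Hypothetical helper: add one level of indirection above [t]. *)
  let wrap t = { node = Inner t }
end

(* GADT + inline record layout, using the type definitions from this thread. *)
module Gadt = struct
  type ('a, 'kind, 'parent) tree =
    | Root : { mutable value : 'a; mutable rank : int }
        -> ('a, [ `root ], 'parent) tree
    | Inner : { mutable parent : 'parent } -> ('a, [ `inner ], 'parent) tree

  type 'a node = Node : ('a, 'kind, 'a node) tree -> 'a node
  [@@ocaml.unboxed]

  type 'a t = ('a, [ `inner ], 'a node) tree

  let create v : 'a t = Inner { parent = Node (Root { value = v; rank = 0 }) }

  (* Hypothetical helper, mirroring [Classic.wrap]. *)
  let wrap (t : 'a t) : 'a t = Inner { parent = Node t }
end

(* Heap words (headers included) reachable from [x]. *)
let words x = Obj.reachable_words (Obj.repr x)

(* Allocate inside non-inlined functions so the blocks are heap-allocated. *)
let[@inline never] classic_sizes () =
  let s = Classic.create 0 in
  (words s, words (Classic.wrap s))

let[@inline never] gadt_sizes () =
  let s = Gadt.create 0 in
  (words s, words (Gadt.wrap s))

let () =
  let c0, c1 = classic_sizes () and g0, g1 = gadt_sizes () in
  Printf.printf "classic: fresh=%d, one extra link=+%d\n" c0 (c1 - c0);
  Printf.printf "gadt:    fresh=%d, one extra link=+%d\n" g0 (g1 - g0)
```

With an immediate payload such as `0`, this should report 7 and +4 for the boxed layout and 5 and +2 for the GADT layout, in line with the figures quoted above. The saving comes from the unboxed `Node` tag and from the inline records, which remove one block per link.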
>
> Regards,
> Markus
>
> On Wed, Sep 21, 2016 at 5:40 PM, Gabriel Scherer
> <gabriel.scherer@gmail.com> wrote:
>> Very nice. Would you have more precise numbers for the "considerably more
>> efficient" part? It's not always easy to find clear benefits to inline
>> records on representative macro-benchmarks.
>>
>> On Thu, Sep 22, 2016 at 2:04 AM, Markus Mottl <markus.mottl@gmail.com>
>> wrote:
>>>
>>> Here is a complete working example of the advantages of using GADTs
>>> with inline records. It also uses the [@@unboxed] feature now
>>> available with OCaml 4.04 as discussed before here, though it required
>>> a little workaround due to an apparent bug in the current beta.
>>>
>>> The below implementation of the union-find algorithm is considerably
>>> more efficient (with the 4.04 beta only) than the Union_find
>>> implementation in the Jane Street Core kernel. The problem admittedly
>>> lends itself to the GADT + inline record trick.
>>>
>>> There is actually one advantage to using an intermediate, unboxed GADT
>>> tag compared to records with existentially quantified fields (if they
>>> were available): functions matching the tag don't require those
>>> horrible type annotations for locally abstract types, because the
>>> match automatically sets up the scope for you. Having to write "Node
>>> foo" instead of just "foo" in some places isn't too bad. Not sure
>>> it's possible to have the best of both worlds.
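
To illustrate the point about annotations, here is a minimal sketch with hypothetical names (`ty`, `packed`, `to_string`, `show` are not from the thread). A function that is polymorphic in a GADT index needs an explicit locally abstract type, whereas a function that matches on an existential tag gets the hidden type brought into scope by the match itself.

```ocaml
(* A small GADT whose index records the type of the payload. *)
type _ ty =
  | Int : int ty
  | Str : string ty

(* Refining [a] per branch requires the locally abstract type annotation. *)
let to_string : type a. a ty -> a -> string =
 fun ty v ->
  match ty with
  | Int -> string_of_int v
  | Str -> v

(* Existential wrapper: the payload type is hidden behind the tag. *)
type packed = Pack : 'a ty * 'a -> packed

(* No annotation needed: matching on [Pack] introduces the hidden type. *)
let show (Pack (ty, v)) = to_string ty v

let () =
  List.iter
    (fun p -> print_endline (show p))
    [ Pack (Int, 42); Pack (Str, "hi") ]
```

The cost is the extra constructor at use sites (`Pack (Int, 42)` rather than a bare pair), which is the same trade-off as writing "Node foo" instead of "foo".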
>>>
>>> ----------
>>> module Union_find = struct
>>>   (* This does not work yet due to an OCaml 4.04 beta bug
>>>   type ('a, 'kind) tree =
>>>     | Root : { mutable value : 'a; mutable rank : int } -> ('a, [ `root ]) tree
>>>     | Inner : { mutable parent : 'a node } -> ('a, [ `inner ]) tree
>>>
>>>   and 'a node = Node : ('a, _) tree -> 'a node [@@ocaml.unboxed]
>>>
>>>   type 'a t = ('a, [ `inner ]) tree
>>>   *)
>>>
>>>   type ('a, 'kind, 'parent) tree =
>>>     | Root : { mutable value : 'a; mutable rank : int } ->
>>>       ('a, [ `root ], 'parent) tree
>>>     | Inner : { mutable parent : 'parent } -> ('a, [ `inner ], 'parent) tree
>>>
>>>   type 'a node = Node : ('a, _, 'a node) tree -> 'a node [@@ocaml.unboxed]
>>>
>>>   type 'a t = ('a, [ `inner ], 'a node) tree
>>>
>>>   let create v = Inner { parent = Node (Root { value = v; rank = 0 }) }
>>>
>>>   let rec compress ~repr:(Inner inner as repr) = function
>>>     | Node (Root _ as root) -> repr, root
>>>     | Node (Inner next_inner as repr) ->
>>>         let repr, _ as res = compress ~repr next_inner.parent in
>>>         inner.parent <- Node repr;
>>>         res
>>>
>>>   let compress_inner (Inner inner as repr) = compress ~repr inner.parent
>>>
>>>   let get_root (Inner inner) =
>>>     match inner.parent with
>>>     | Node (Root _ as root) -> root  (* Avoids compression call *)
>>>     | Node (Inner _ as repr) ->
>>>         let repr, root = compress_inner repr in
>>>         inner.parent <- Node repr;
>>>         root
>>>
>>>   let get t = let Root r = get_root t in r.value
>>>
>>>   let set t x = let Root r = get_root t in r.value <- x
>>>
>>>   let same_class t1 t2 = get_root t1 == get_root t2
>>>
>>>   let union t1 t2 =
>>>     let Inner inner1 as repr1, (Root r1 as root1) = compress_inner t1 in
>>>     let Inner inner2 as repr2, (Root r2 as root2) = compress_inner t2 in
>>>     if root1 == root2 then ()
>>>     else
>>>       let n1 = r1.rank in
>>>       let n2 = r2.rank in
>>>       if n1 < n2 then inner1.parent <- Node repr2
>>>       else begin
>>>         inner2.parent <- Node repr1;
>>>         if n1 = n2 then r1.rank <- r1.rank + 1
>>>       end
>>> end  (* Union_find *)
>>> ----------
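
A usage sketch for a module of this shape follows. The signature `S` is an assumption about the interface the module above appears to expose (`create`, `get`, `set`, `same_class`, `union`). `Naive` is a deliberately simple stand-in with no ranks and no path compression, included only so the sketch runs on its own; `Demo` and `run` are hypothetical names.

```ocaml
(* Assumed interface of a union-find module such as the one above. *)
module type S = sig
  type 'a t
  val create : 'a -> 'a t
  val get : 'a t -> 'a
  val set : 'a t -> 'a -> unit
  val same_class : 'a t -> 'a t -> bool
  val union : 'a t -> 'a t -> unit
end

(* Naive stand-in: plain parent links, no rank, no compression. *)
module Naive : S = struct
  type 'a t = { mutable link : 'a link }
  and 'a link = Value of 'a | Parent of 'a t

  let create v = { link = Value v }

  let rec root t =
    match t.link with
    | Value _ -> t
    | Parent p -> root p

  let get t =
    match (root t).link with
    | Value v -> v
    | Parent _ -> assert false (* [root] always returns a [Value] cell *)

  let set t v = (root t).link <- Value v
  let same_class a b = root a == root b

  let union a b =
    let ra = root a and rb = root b in
    if ra != rb then rb.link <- Parent ra
end

(* Client code written only against [S]. *)
module Demo (U : S) = struct
  let run () =
    let a = U.create "a" and b = U.create "b" and c = U.create "c" in
    U.union a b;
    U.set a "ab"; (* updates the value shared by the whole class *)
    (U.same_class a b, U.same_class a c, U.get b)
end

module D = Demo (Naive)

let () =
  let same_ab, same_ac, v = D.run () in
  Printf.printf "%b %b %s\n" same_ab same_ac v
```

Writing the client as a functor over `S` keeps it independent of the representation, so the boxed and GADT variants could be swapped in for a like-for-like comparison.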
>>>
>>> Regards,
>>> Markus
>>>
>>> On Wed, Sep 21, 2016 at 6:14 AM, Lukasz Stafiniak <lukstafi@gmail.com>
>>> wrote:
>>> > On Wed, Sep 21, 2016 at 12:11 PM, Lukasz Stafiniak <lukstafi@gmail.com>
>>> > wrote:
>>> >>
>>> >> A simple solution would be to "A-transform" (IIRC the term) accesses
>>> >
>>> > Sorry, I forgot to define this. I mean rewrite rules like:
>>> > [f r.x] ==> [let x = r.x in f x]
>>> > where subsequently the existential variable is introduced (unpacked)
>>> > at the let-binding level. This corresponds to a single-variant GADT
>>> > pattern match.
>>> >
>>> >> to fields with existential type variables. This would give a more
>>> >> narrow scope on the expression level than you suggest, but a
>>> >> well-defined one prior to type inference. To broaden the scope you
>>> >> would need to let-bind the field access yourself at the appropriate
>>> >> level.
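
Records with existentially quantified fields are not available, but the correspondence with a single-variant GADT match can be sketched with today's constructs. In the minimal example below (the `counter` type and `step_and_read` are hypothetical), the field access is let-bound inside the match, and the hidden type `'s` is in scope only within that branch.

```ocaml
(* The state type ['s] is existential: it is hidden from users of [counter]. *)
type counter =
  | Counter : { state : 's; step : 's -> 's; read : 's -> int } -> counter

let step_and_read (c : counter) : int =
  match c with
  | Counter r ->
      (* [f r.state] rewritten as [let x = r.state in f x];
         [x] has the hidden type, which cannot escape this branch. *)
      let x = r.state in
      r.read (r.step x)

let () =
  let ints = Counter { state = 0; step = succ; read = (fun n -> n) } in
  let strs =
    Counter { state = "x"; step = (fun s -> s ^ "x"); read = String.length }
  in
  Printf.printf "%d %d\n" (step_and_read ints) (step_and_read strs)
```

Returning `x` itself from the branch would be rejected by the type checker, which is the scoping constraint the proposal has to reproduce for plain field accesses.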
>>>
>>>
>>>
>>> --
>>> Markus Mottl http://www.ocaml.info markus.mottl@gmail.com
>>>
>>> --
>>> Caml-list mailing list. Subscription management and archives:
>>> https://sympa.inria.fr/sympa/arc/caml-list
>>> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
>>> Bug reports: http://caml.inria.fr/bin/caml-bugs
>>
>>
>
>
>
> --
> Markus Mottl http://www.ocaml.info markus.mottl@gmail.com