From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by sympa.inria.fr (Postfix) with ESMTPS id CE6BA7ED11 for ; Sat, 24 Sep 2016 07:10:09 +0200 (CEST) Authentication-Results: mail2-smtp-roc.national.inria.fr; spf=None smtp.pra=yminsky@janestreet.com; spf=Pass smtp.mailfrom=yminsky@janestreet.com; spf=None smtp.helo=postmaster@mxout3.mail.janestreet.com Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender authenticity information available from domain of yminsky@janestreet.com) identity=pra; client-ip=38.105.200.229; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="yminsky@janestreet.com"; x-sender="yminsky@janestreet.com"; x-conformance=sidf_compatible Received-SPF: Pass (mail2-smtp-roc.national.inria.fr: domain of yminsky@janestreet.com designates 38.105.200.229 as permitted sender) identity=mailfrom; client-ip=38.105.200.229; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="yminsky@janestreet.com"; x-sender="yminsky@janestreet.com"; x-conformance=sidf_compatible; x-record-type="v=spf1" Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender authenticity information available from domain of postmaster@mxout3.mail.janestreet.com) identity=helo; client-ip=38.105.200.229; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="yminsky@janestreet.com"; x-sender="postmaster@mxout3.mail.janestreet.com"; x-conformance=sidf_compatible IronPort-PHdr: =?us-ascii?q?9a23=3As3XKyxGuglpynndIG6CoJJ1GYnF86YWxBRYc798d?= =?us-ascii?q?s5kLTJ75pMiwAkXT6L1XgUPTWs2DsrQf2rCQ6/GrCTxIoc7Y9itdINoUD15NoP?= =?us-ascii?q?5VtjJjKfbNMVf8Iv/uYn5yN+V5f3ghwUuGN1NIEt31fVzYry76xzcTHhLiKVg9?= =?us-ascii?q?fbytScb6xv663OGq+pDVfx4AxH/kOeszfyONsB7Ju8IKrYxnI6c3gluV8zobM9?= =?us-ascii?q?hRkEZhP1GOhF7Z4dus+NY39i1KuO86sctHTbn+V6s9RL1cSj8hNjZmytfssEzy?= =?us-ascii?q?RBGI4DM5U2MNkQsAVxnA7RfhXYbZsCL8u/FhwiSXIYv9SrViCmfq1LtiVBK90H?= =?us-ascii?q?RPDDU+6myCz5EpgQ=3D=3D?= X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: =?us-ascii?q?A0B1AABjCeZXfeXIaSZdGwEBAQMBAQEJA?= =?us-ascii?q?QEBFgEBAQMBAQEJAQEBgxABAQEBAXV8B40smVSFLoRFiAKCBCSFegKBUwc4FAE?= =?us-ascii?q?BAQEBAQEBAQEBEgEBCRYJSIIyBAEVBYIRAQEEEhEEGQEBExkLAQ8LCw0CAgkdA?= =?us-ascii?q?gIhARIBBQEKEgYTEgkHiA8DFwMLonKBMj4yilZnhEkBAQWENwMKg0IBAQEBAQE?= =?us-ascii?q?BAwEBAQEBAQEBFwgQdoUxg0+BBYJHhQGCWoZvDIFAhnSKFzWGJ4ZJgniCPI0vi?= =?us-ascii?q?FyED4I6Ex6BER6DQYIJY4YkgUABAQE?= X-IPAS-Result: =?us-ascii?q?A0B1AABjCeZXfeXIaSZdGwEBAQMBAQEJAQEBFgEBAQMBAQE?= =?us-ascii?q?JAQEBgxABAQEBAXV8B40smVSFLoRFiAKCBCSFegKBUwc4FAEBAQEBAQEBAQEBE?= =?us-ascii?q?gEBCRYJSIIyBAEVBYIRAQEEEhEEGQEBExkLAQ8LCw0CAgkdAgIhARIBBQEKEgY?= =?us-ascii?q?TEgkHiA8DFwMLonKBMj4yilZnhEkBAQWENwMKg0IBAQEBAQEBAwEBAQEBAQEBF?= =?us-ascii?q?wgQdoUxg0+BBYJHhQGCWoZvDIFAhnSKFzWGJ4ZJgniCPI0viFyED4I6Ex6BER6?= =?us-ascii?q?DQYIJY4YkgUABAQE?= X-IronPort-AV: E=Sophos;i="5.30,385,1470693600"; d="scan'208";a="238086332" Received: from mxout3.mail.janestreet.com ([38.105.200.229]) by mail2-smtp-roc.national.inria.fr with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 24 Sep 2016 07:10:08 +0200 Received: from tot-qpr-mailcore1.delacy.com ([172.27.56.68] helo=tot-qpr-mailcore1) by mxout3.mail.janestreet.com with esmtps (TLSv1:DHE-RSA-AES256-SHA:256) (Exim 4.82) (envelope-from ) id 1bnfDy-0002li-Fu for caml-list@inria.fr; Sat, 24 Sep 2016 01:10:06 -0400 X-JS-Flow: external X-JS-Scanner-attachment: (ok) No attachments Received: by tot-qpr-mailcore1 with JS-mailcore (0.1) (envelope-from ) id BX5gqu-AAAA10-Np; 2016-09-24 01:10:06.438431-04:00 Received: from mail-yw0-f198.google.com ([209.85.161.198]) by mxgoog1.mail.janestreet.com with esmtps (UNKNOWN:AES128-GCM-SHA256:128) (Exim 4.72) (envelope-from ) id 1bnfDy-00077C-B5 for caml-list@inria.fr; Sat, 24 Sep 2016 01:10:06 -0400 Received: by mail-yw0-f198.google.com with SMTP id v2so42244049ywg.3 for ; Fri, 23 Sep 2016 22:10:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=janestreet.com; s=google; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=Zq28+DhEP4HOBCWIMEoIkjLCAkgTndV1fNJ0kuIFIK0=; b=pDhHi4DRjvzDOyaz4N9VVObQE7lTtrKRlF9IImuCPCIxpZfHtizC7DyU6slSXzU0tY Ut3syYY3Tgqpq1mLvXQqgvWuXWKcpXkVb4NApq9XixXNiQVEmJg1tc3dM8UKqDkq4zhy bgsZPrq1DqRkIFc15K9I2IJ0+yhwkMUvohBV0= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=Zq28+DhEP4HOBCWIMEoIkjLCAkgTndV1fNJ0kuIFIK0=; b=CJ8u0MF3gCW9UZIT8w/Ubf1MIDbeB5PtGlj78hPzz3VoIN1dia8B4ZNc69KckBC2tk M0myCtOigRTYicRp5v+zoubIRIEM4jDvAknr8B+P2R+sao6MqKC5wbADP8rPFl+hrj5x fn3XV8FC6qoSMaFBBhMzSqUUq0KGIcpWvICK/imJE14tujsCecLKltV1/6gj/Xc0ymeW aV9GR12XqwS7RhKRYysgiVIZwvPLQ0Ab9kJjacYsE6Lb1M4T4TbvjfE1ST5F8rs1gzIa C/rbggQ01snURmoYqaj4L4F3rM4XhpEMMrJO9So9eawNcklbCi8/npKCAUTSf6TIb2VF jF+Q== X-Gm-Message-State: AE9vXwMrv7ReN17sR3o20ss4OYxPOaDUFgZyPYQWwUOLxV+LuC64BH3B2ff/an7GKACe9YqjLLx5r8O+jfLXKg/SW/I76bhGwFm0OZVTQhIhB+RbA5WUi50zjngh18G0m6gOHH7Lucgg86dasZe4 X-Received: by 10.37.231.199 with SMTP id e190mr8991066ybh.194.1474693805993; Fri, 23 Sep 2016 22:10:05 -0700 (PDT) X-Received: by 10.37.231.199 with SMTP id e190mr8991055ybh.194.1474693805764; Fri, 23 Sep 2016 22:10:05 -0700 (PDT) MIME-Version: 1.0 Received: by 10.37.201.1 with HTTP; Fri, 23 Sep 2016 22:09:45 -0700 (PDT) In-Reply-To: References: <2161155.JggR3X9ZFM@molnar> <2324346.DfHVUap7Qc@molnar> From:Yaron Minsky Date: Sat, 24 Sep 2016 14:09:45 +0900 Message-ID: To:Markus Mottl Cc:Gabriel Scherer , Lukasz Stafiniak , OCaml List Content-Type: text/plain; charset=UTF-8 X-JS-Processed-by: mailcore Subject: Re: [Caml-list] Covariant GADTs This looks like a nice improvement. A PR would be very welcome... y On Thu, Sep 22, 2016 at 9:39 AM, Markus Mottl wrote: > The direct comparison with the Jane Street implementation showed a 40% > speed increase for some random things I tried, but that's not a fair > comparison. If I improve the JS code, e.g. to avoid allocations, the > performance improvement due to the GADT + inlined records drops to > only about 10%. > > In terms of memory, a freshly created set costs 7 machine words in the > original code vs. 5 for the GADT. Adding one rank costs 4 machine > words in the standard implementation vs. only 2 for GADTs. That's a > pretty significant size reduction. The GADT representation would > surely help in programs that allocate a lot of these values, but the > values don't tend to grow much internally due to the tree compression > algorithm. I'm sure there are better examples where a program would > typically allocate GADT-based data structures of more significant > size. > > Regards, > Markus > > On Wed, Sep 21, 2016 at 5:40 PM, Gabriel Scherer > wrote: >> Very nice. Would you have more precise numbers for the "considerably more >> efficient" part? It's not always easy to find clear benefits to inline >> records on representative macro-benchmarks. >> >> On Thu, Sep 22, 2016 at 2:04 AM, Markus Mottl >> wrote: >>> >>> Here is a complete working example of the advantages of using GADTs >>> with inline records. It also uses the [@@unboxed] feature now >>> available with OCaml 4.04 as discussed before here, though it required >>> a little workaround due to an apparent bug in the current beta. >>> >>> The below implementation of the union-find algorithm is considerably >>> more efficient (with the 4.04 beta only) than the Union_find >>> implementation in the Jane Street Core kernel. The problem admittedly >>> lends itself to the GADT + inline record trick. >>> >>> There is actually one advantage to using an intermediate, unboxed GADT >>> tag compared to records with existentially quantified fields (if they >>> were available): functions matching the tag don't require those >>> horrible type annotations for locally abstract types, because the >>> match automatically sets up the scope for you. Having to write "Node >>> foo" instead of just "foo" in some places isn't too bad. Not sure >>> it's possible to have the best of both worlds. >>> >>> ---------- >>> module Union_find = struct >>> (* This does not work yet due to an OCaml 4.04 beta bug >>> type ('a, 'kind) tree = >>> | Root : { mutable value : 'a; mutable rank : int } -> ('a, [ `root ]) >>> tree >>> | Inner : { mutable parent : 'a node } -> ('a, [ `inner ]) tree >>> >>> and 'a node = Node : ('a, _) tree -> 'a node [@@ocaml.unboxed] >>> >>> type 'a t = ('a, [ `inner ]) tree >>> *) >>> >>> type ('a, 'kind, 'parent) tree = >>> | Root : { mutable value : 'a; mutable rank : int } -> >>> ('a, [ `root ], 'parent) tree >>> | Inner : { mutable parent : 'parent } -> ('a, [ `inner ], 'parent) >>> tree >>> >>> type 'a node = Node : ('a, _, 'a node) tree -> 'a node >>> [@@ocaml.unboxed] >>> >>> type 'a t = ('a, [ `inner ], 'a node) tree >>> >>> let create v = Inner { parent = Node (Root { value = v; rank = 0 }) } >>> >>> let rec compress ~repr:(Inner inner as repr) = function >>> | Node (Root _ as root) -> repr, root >>> | Node (Inner next_inner as repr) -> >>> let repr, _ as res = compress ~repr next_inner.parent in >>> inner.parent <- Node repr; >>> res >>> >>> let compress_inner (Inner inner as repr) = compress ~repr inner.parent >>> >>> let get_root (Inner inner) = >>> match inner.parent with >>> | Node (Root _ as root) -> root (* Avoids compression call *) >>> | Node (Inner _ as repr) -> >>> let repr, root = compress_inner repr in >>> inner.parent <- Node repr; >>> root >>> >>> let get t = let Root r = get_root t in r.value >>> >>> let set t x = let Root r = get_root t in r.value <- x >>> >>> let same_class t1 t2 = get_root t1 == get_root t2 >>> >>> let union t1 t2 = >>> let Inner inner1 as repr1, (Root r1 as root1) = compress_inner t1 in >>> let Inner inner2 as repr2, (Root r2 as root2) = compress_inner t2 in >>> if root1 == root2 then () >>> else >>> let n1 = r1.rank in >>> let n2 = r2.rank in >>> if n1 < n2 then inner1.parent <- Node repr2 >>> else begin >>> inner2.parent <- Node repr1; >>> if n1 = n2 then r1.rank <- r1.rank + 1 >>> end >>> end (* Union_find *) >>> ---------- >>> >>> Regards, >>> Markus >>> >>> On Wed, Sep 21, 2016 at 6:14 AM, Lukasz Stafiniak >>> wrote: >>> > On Wed, Sep 21, 2016 at 12:11 PM, Lukasz Stafiniak >>> > wrote: >>> >> >>> >> A simple solution would be to "A-transform" (IIRC the term) accesses >>> > >>> > Sorry, I forgot to define this. I mean rewrite rules like: >>> > [f r.x] ==> [let x = r.x in f x] >>> > where subsequently the existential variable is introduced (unpacked) >>> > at the let-binding level. This corresponds to a single-variant GADT >>> > pattern match. >>> > >>> >> to fields with existential type variables. This would give a more >>> >> narrow scope on the expression level than you suggest, but a >>> >> well-defined one prior to type inference. To broaden the scope you >>> >> would need to let-bind the field access yourself at the appropriate >>> >> level. >>> >>> >>> >>> -- >>> Markus Mottl http://www.ocaml.info markus.mottl@gmail.com >>> >>> -- >>> Caml-list mailing list. Subscription management and archives: >>> https://sympa.inria.fr/sympa/arc/caml-list >>> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners >>> Bug reports: http://caml.inria.fr/bin/caml-bugs >> >> > > > > -- > Markus Mottl http://www.ocaml.info markus.mottl@gmail.com > > -- > Caml-list mailing list. Subscription management and archives: > https://sympa.inria.fr/sympa/arc/caml-list > Beginner's list: http://groups.yahoo.com/group/ocaml_beginners > Bug reports: http://caml.inria.fr/bin/caml-bugs