From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by sympa.inria.fr (Postfix) with ESMTPS id 2523980198 for ; Tue, 11 Jul 2017 00:28:19 +0200 (CEST) Authentication-Results: mail2-smtp-roc.national.inria.fr; spf=None smtp.pra=alex.only.d@gmail.com; spf=Pass smtp.mailfrom=alex.only.d@gmail.com; spf=None smtp.helo=postmaster@mail-yw0-f172.google.com Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender authenticity information available from domain of alex.only.d@gmail.com) identity=pra; client-ip=209.85.161.172; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="alex.only.d@gmail.com"; x-sender="alex.only.d@gmail.com"; x-conformance=sidf_compatible Received-SPF: Pass (mail2-smtp-roc.national.inria.fr: domain of alex.only.d@gmail.com designates 209.85.161.172 as permitted sender) identity=mailfrom; client-ip=209.85.161.172; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="alex.only.d@gmail.com"; x-sender="alex.only.d@gmail.com"; x-conformance=sidf_compatible; x-record-type="v=spf1" Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender authenticity information available from domain of postmaster@mail-yw0-f172.google.com) identity=helo; client-ip=209.85.161.172; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="alex.only.d@gmail.com"; x-sender="postmaster@mail-yw0-f172.google.com"; x-conformance=sidf_compatible IronPort-PHdr: =?us-ascii?q?9a23=3AJ7itshduL/fSYWlek6qRDM5zlGMj4u6mDksu8pMi?= =?us-ascii?q?zoh2WeGdxcW4Yh7h7PlgxGXEQZ/co6odzbGH7Oa4ASQp2tWoiDg6aptCVhsI24?= =?us-ascii?q?09vjcLJ4q7M3D9N+PgdCcgHc5PBxdP9nC/NlVJSo6lPwWB6nK94iQPFRrhKAF7?= =?us-ascii?q?Ovr6GpLIj8Swyuu+54Dfbx9GiTe5Y75+Ngm6oRnMvcQKnIVuLbo8xAHUqXVSYe?= =?us-ascii?q?RWwm1oJVOXnxni48q74YBu/SdNtf8/7sBMSar1cbg2QrxeFzQmLns65Nb3uhnZ?= =?us-ascii?q?TAuA/WUTX2MLmRdVGQfF7RX6XpDssivms+d2xSeXMdHqQb0yRD+v9LlgRgP2hy?= =?us-ascii?q?gbNj456GDXhdJ2jKJHuxKquhhzz5fJbI2JKPZye6XQctQHS2pcRcZRTzJODZ+g?= =?us-ascii?q?b4UBCOoBOPxXr4j7p1ATqRezCg2hCObpxzRVhHH5wLc63vwjHgHI3AIuEdEAvm?= =?us-ascii?q?nKotrpL6oSTfy5wbPUwTnfc/9b2zHw45XIfBA7pvGMWKp9fNbVyUYxGALKkFWR?= =?us-ascii?q?opHqMTOa0eQNqW+b7/R9Xu+okWEnrx9+oze1yscrjInJgoIUxkrZ+ihiz4Y1IM?= =?us-ascii?q?e3SE9/YdK+DJRQsCSaOpJwT8g/TW9ovyM6xacHuZ69ZCUKyZInxwTea/OdaYSI?= =?us-ascii?q?7AjjWP6eITd5mHJleK+/iA2o/Ue8ze38U9G40VZQoSpFldnMsWoB2ADU6siCUv?= =?us-ascii?q?d98F+h1iqI1wDW8uFEJV47lbbFJJI73rEwkZ8TvVzEHi/1nUX2ja2Wel8j+uiy?= =?us-ascii?q?5OTqZKjtqJyEN4JslA3yLqAjlta8DOk4KAQCQmmW9fmm2LH+/0D1XrNHheAsnK?= =?us-ascii?q?bDqpDVP8Ebq7a5AwBL1oYj7A6yDzK839QZmXkLNVJEeRybg4TwNVHCPfL1Aeml?= =?us-ascii?q?j1SjlzdrwP/GPrn/DZnXMnfDl7Lhca58605a1gUz0chS64xIBrwFOv7+WU/8uM?= =?us-ascii?q?bFAhI4LgC42fvrBddj2o8GXGKAGK6ZMKfcsV+S4eIvJvGBa5URuDnjJPkp/fnu?= =?us-ascii?q?jXk9mV4dZ6WmwIAaaH+9Hvt8IkWZZWDgjcsGEWcPpgY+VvDliEWeUT5PYHa/R7?= =?us-ascii?q?4z6Ss+CIKiFIvDQoGtgKed3CqgBZ1XZmVGCkiWHnvydoWEXe0MaCOILcN7nDwE?= =?us-ascii?q?T+vpd4h09A+nskfVzKZgMOHU+zED/cbiytdd5uDemFc17zMiXOqH1GTYZmV5jm?= =?us-ascii?q?4ZDxI527p4vwQpw1OOwe5/hfhJU9B76PZAUwN8PpnZmb8pQ+vuUx7MK4/aAG2t?= =?us-ascii?q?Rc+rVHRsF98=3D?= X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: =?us-ascii?q?A0AGAQAy/2NZhqyhVdFdHAEBBAEBCgEBF?= =?us-ascii?q?wEBBAEBCgEBg1Q/gRQHnj8GgSiCQpVSLIVLAoMlB0MUAQEBAQEBAQEBAQESAQE?= =?us-ascii?q?BCAsLCCgvgjMkgkIBAgECIwQZARsSCwEDDAYFCwMKAgIJHQICIQEBEQEFAQoSB?= =?us-ascii?q?hMSDYl3AQMIDRCfED+MCoFsGAUBHIMGBYNiChknAwpWgyIBAQEBAQUBAQEBHAI?= =?us-ascii?q?GEnmCHS+CBFKEQIEMglcphH2CYQWHewyJUIVehy47h0iHCUuEboIlPo8/i3mHf?= =?us-ascii?q?xQfgRU2gSsxISNeGoRqH4F6OTYBBIhfAQEB?= X-IPAS-Result: =?us-ascii?q?A0AGAQAy/2NZhqyhVdFdHAEBBAEBCgEBFwEBBAEBCgEBg1Q?= =?us-ascii?q?/gRQHnj8GgSiCQpVSLIVLAoMlB0MUAQEBAQEBAQEBAQESAQEBCAsLCCgvgjMkg?= =?us-ascii?q?kIBAgECIwQZARsSCwEDDAYFCwMKAgIJHQICIQEBEQEFAQoSBhMSDYl3AQMIDRC?= =?us-ascii?q?fED+MCoFsGAUBHIMGBYNiChknAwpWgyIBAQEBAQUBAQEBHAIGEnmCHS+CBFKEQ?= =?us-ascii?q?IEMglcphH2CYQWHewyJUIVehy47h0iHCUuEboIlPo8/i3mHfxQfgRU2gSsxISN?= =?us-ascii?q?eGoRqH4F6OTYBBIhfAQEB?= X-IronPort-AV: E=Sophos;i="5.40,342,1496095200"; d="scan'208";a="282858334" Received: from mail-yw0-f172.google.com ([209.85.161.172]) by mail2-smtp-roc.national.inria.fr with ESMTP/TLS/AES128-GCM-SHA256; 11 Jul 2017 00:28:18 +0200 Received: by mail-yw0-f172.google.com with SMTP id a12so41817638ywh.3 for ; Mon, 10 Jul 2017 15:28:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=jdvrofWuEVg0evqJnzLfd8qmgH6u98EHH9BkeDdCecI=; b=nodmbh8VGDQ9ELsUoIC8uHB2YfOdgBkzRm2cHtc0jfBEDh+CaTM5+z53sqMW7C1omp jWHC4e+WU6q24j5TwXsltGjb/9HL0W1+jAH1TvrPYGS+rkBJXh+Ds6UGUL4HjKRQRnav eo2lXiwhap5h4IyN2kAYpxZfIwA6cGtMRTU+CSeQCvFv0B1Euk6vlFEwfj0RQhAHlLMb qFZgjpqdbXdNUWMft6MBjhULUrXAdjkHVmVkfvrQjvGo6yz/AJW1FEFPC+pBsoxZCVP6 nWVrubXpSU9ethKEKaLQznF3RRcRUj4wLat7qFYXubW/OnyuJy0mKVzUf8DFmpi49IAG 0r/w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=jdvrofWuEVg0evqJnzLfd8qmgH6u98EHH9BkeDdCecI=; b=TVPvsrJ9DHnAFLB4DfO1vWVfs3kFj0dPEk9TNjHlZzRt7tOF063PkQtgugx4sD8R6a LWcshB9vabYTpucA0xEcsbIDj6onP+OmDITjBwnKibCOvujYp8oaSOW5KS/0yho0H3gT QVzK7NP4Agoz7u/aKuUZQ/ow9cVMyoUXZTfGGBH72rof1C89y0iLcQt7VA75B8TV07pp /0qfd1lrwlwVPl0yKVJSGxAxfkfVOKINElgUM/LX6V8vohNXlSktHMIyX6jdBdg7IB5x kW+sYjb5HBPzMoHpRExFMmvopW5gc3C3w0djmuobCHbv/9kcFfiKGMCifMquuGoflKzz SkuQ== X-Gm-Message-State: AIVw113PQ32cciqDtNotWIN5stxTSQ20MFlAEvjntP4p7p/vwu26Kutu nogJ6acymMGFzPFQp1zBlYg97US7Nw== X-Received: by 10.129.6.146 with SMTP id 140mr2482631ywg.120.1499725696732; Mon, 10 Jul 2017 15:28:16 -0700 (PDT) MIME-Version: 1.0 Received: by 10.37.12.130 with HTTP; Mon, 10 Jul 2017 15:28:16 -0700 (PDT) In-Reply-To: References: From: Alexey Egorov Date: Tue, 11 Jul 2017 03:28:16 +0500 Message-ID: To: Ivan Gotovchits Cc: caml users Content-Type: text/plain; charset="UTF-8" Subject: Re: [Caml-list] Optimizing pure-functional streams Hi Ivan, thanks for your analysis! Which version of OCaml you are using? I'm tried on 4.04.1 and trunk (with and without flambda) and still getting the same (poor) performance for recursive version, no matter if curried or uncurried loop is used. 2017-07-11 2:29 GMT+05:00 Ivan Gotovchits : > Hi Alexey, > > The recursion version is slower because you're allocating a tuple on each > recursive call. Try to use a curried form of a function, and you will a > performance that is even better than the for-loop (10% better in my case). > E.g., > > > > let loop high = > let rec loop i = function > | t when i > high -> t > | t when i mod 2 = 0 -> loop (i + 1) (t + i * 2) > | t -> loop (i + 1) t in > loop 1 0 > > > The for-loop version is indeed slower than the C version because of tagging, > c.f., the OCaml implementation: > > movq $1, %rbx > movq $3, %rdi > cmpq %rax, %rdi > jg .L100 > > > .L101: > movq %rdi, %rsi > sarq $1, %rsi > movq %rsi, %rdx > shr movq $1, %rbx > movq $3, %rdi > cmpq %rax, %rdi > jg .L100 > > .L101: > movq %rdi, %rsi > sarq $1, %rsi > moq $63, %rdx > movq %rsi, %rcx > addq %rdx, %rcx > andq $-2, %rcx > subq %rcx, %rsi > leaq 1(%rsi,%rsi), %rsi > cmpq $1, %rsi > jne .L102 > leaq -2(%rbx,%rdi,2), %rbx > .L102: > movq %rdi, %rsi > addq $2, %rdi > cmpq %rax, %rsi > jne .L101 > .L100: > movq %rbx, %rax > ret > > > With the GCC output: > > > testl %edi, %edi > jle .L6 > addl $1, %edi > movl $4, %ecx > movl $1, %edx > xorl %eax, %eax > jmp .L3 > .L5: > leaq (%rax,%rcx), %rsi > testb $1, %dl > cmove %rsi, %rax > addq $2, %rcx > .L3: > addl $1, %edx > cmpl %edi, %edx > jne .L5 > rep ret > .L6: > xorl %eax, %eax > ret > > > Notice this `sar` and `shr` instructions, they are here due to the tagging. > The GCC compiler was also able to leverage the conditional move instruction > `cmove`, though it looks nice I'm not sure how much it actually won. There > is no need to use flambda for the recursive and for-loop as they are already > optimized to the maximum. > > Concerning your stream implementation, you may compare it to the Core > Kernel's Sequence. It is basically the same, so you can play the > benchmarking game :) > > The general way to implement a stream with a zero performance penalty (i.e., > as fast as a for loop) is to use staging. This topic is explored in detail > in the Oleg Kiselyov et alas Strymonad [1] paper. The implementation, as > well as other interesting discussions about streams, are available on his > homepage [2] (there are tons of very interesting stuff there!). > > Best, > Ivan > > [1]: http://okmij.org/ftp/meta-programming/strymonas.pdf > [2]: http://okmij.org/ftp/Streams.html > > > > > > On Mon, Jul 10, 2017 at 4:26 PM, Alexey Egorov > wrote: >> >> Hello, >> >> Michael Snoyman recently posted very interesting article about Haskell >> streams and Rust iterators ( >> https://www.fpcomplete.com/blog/2017/07/iterators-streams-rust-haskell >> ) and I tried to do the same in OCaml. >> >> You can find my code here - >> https://gist.github.com/anonymous/127e9116b362d561c5dfa9cce6b453f3 >> >> There is four different versions: >> >> 1) c_loop.c - plain C loop >> 2) f_loop.ml - for-loop in OCaml >> 3) loop.ml - OCaml loop using recursion >> 4) stream.ml - OCaml loop using streams >> >> Here is the results: >> >> c_loop.c: 0m0.105s >> f_loop.ml: 0m0.184s >> loop.ml: 0m0.207s >> stream.ml: 0m0.605s >> >> It's not suprise that C version is fastest - at least, it uses untagged >> ints. >> What surprises me is that recursion is slower than for-loop - can't >> OCaml compile it down to the same imperative loop? >> >> And it's very dissapointing that stream version is 3 times slower than >> recursion-based. >> The problem, I believe, is that OCaml is unable to optimize >> intermediate stream constructors away (as GHC do). >> >> I tried to add inline annotations, which helps a little - is there >> something I can do to improve it further? >> >> -- >> Caml-list mailing list. Subscription management and archives: >> https://sympa.inria.fr/sympa/arc/caml-list >> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners >> Bug reports: http://caml.inria.fr/bin/caml-bugs > >