From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by sympa.inria.fr (Postfix) with ESMTPS id 547807FB0A for ; Fri, 28 Nov 2014 22:06:24 +0100 (CET) Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender authenticity information available from domain of jun.lambda@gmail.com) identity=pra; client-ip=209.85.223.180; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="jun.lambda@gmail.com"; x-sender="jun.lambda@gmail.com"; x-conformance=sidf_compatible Received-SPF: Pass (mail3-smtp-sop.national.inria.fr: domain of jun.lambda@gmail.com designates 209.85.223.180 as permitted sender) identity=mailfrom; client-ip=209.85.223.180; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="jun.lambda@gmail.com"; x-sender="jun.lambda@gmail.com"; x-conformance=sidf_compatible; x-record-type="v=spf1" Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender authenticity information available from domain of postmaster@mail-ie0-f180.google.com) identity=helo; client-ip=209.85.223.180; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="jun.lambda@gmail.com"; x-sender="postmaster@mail-ie0-f180.google.com"; x-conformance=sidf_compatible X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AvkBABDjeFTRVd+0lGdsb2JhbABbg1dYBMcKhk4CgQMHFgEBAQEBEQEBAQEHCwsJEjCEAwEBBBIRHQEbHQEDDAYFCw0qAgICHwEBEQEFARwGExsHiAgBAxINsCE9MYszgWyDCooSChknDVqFKAEBAQEBAQEBAQEBAQEBAQEBAQETAQUOjjWCHRsHgnWBUwWGA40agU5ngjsHgjKCAYEujUeCTIIGGCiFOD4wAQGCRQEBAQ X-IPAS-Result: AvkBABDjeFTRVd+0lGdsb2JhbABbg1dYBMcKhk4CgQMHFgEBAQEBEQEBAQEHCwsJEjCEAwEBBBIRHQEbHQEDDAYFCw0qAgICHwEBEQEFARwGExsHiAgBAxINsCE9MYszgWyDCooSChknDVqFKAEBAQEBAQEBAQEBAQEBAQEBAQETAQUOjjWCHRsHgnWBUwWGA40agU5ngjsHgjKCAYEujUeCTIIGGCiFOD4wAQGCRQEBAQ X-IronPort-AV: E=Sophos;i="5.07,478,1413237600"; d="diff'?scan'208";a="90960990" Received: from mail-ie0-f180.google.com ([209.85.223.180]) by mail3-smtp-sop.national.inria.fr with ESMTP/TLS/RC4-SHA; 28 Nov 2014 22:06:32 +0100 Received: by mail-ie0-f180.google.com with SMTP id rp18so6637223iec.11 for ; Fri, 28 Nov 2014 13:06:21 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type; bh=k6PqT8x5+kgUDzIUAybJmoXfadh3Kzmvu10XIEFmq30=; b=kctsypLBN+x8VP8n+2AqbdZayR2pKeBmFEY4zkwuUHIAcSHBmNVwOhqFeWMEZnprzm 481aKzkOuGbvVbWTC62hpxTbUSKTRFnt+X8KIBRu+NhsTWoKAdQPhjf90G+2a5Splnfh C2sROGt05S51Y04ySFk4xEQSVJGB8rsbrN1ZS3AjPeHXqmPOFdpUlqPOfIegdU0+3Fva Hny6Vx8Unh1t03x8MkuJbxbsyUy1JN3Rq4s8mvcrWLlnA1PsJHd7ARkDHRJu1+it3ZMG U9oBgmGsnFLhWQ1gpiK7o1u1qnfENxlYyHFBARFrEk/upXoFeGvFVr1Q9TS+jg7HMi0t bk9Q== X-Received: by 10.42.25.144 with SMTP id a16mr37494122icc.66.1417208781575; Fri, 28 Nov 2014 13:06:21 -0800 (PST) MIME-Version: 1.0 Received: by 10.107.128.93 with HTTP; Fri, 28 Nov 2014 13:05:50 -0800 (PST) In-Reply-To: References: <20141128133958.GA6607@tbrk.org> From: Jun Inoue Date: Fri, 28 Nov 2014 22:05:50 +0100 Message-ID: To: Gabriel Scherer Cc: Timothy Bourke , OCaml list , Marc Pouzet Content-Type: multipart/mixed; boundary=20cf303ea582ffae830508f1a132 Subject: Re: [Caml-list] Sundials/ML 2.5.0 --20cf303ea582ffae830508f1a132 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Thank you for sharing this interesting information, Gabriel! We benchmarked the code exclusively with 4.01.0 and didn't know about the performance boost in printf. I just measured the performance with examples/kinsol/serial/kinFerTron_dns.ml, which was one of the examples that had the most pronounced effect. As you suggest, the numbers are a lot closer with 4.02.1. The median wall-clock times of 10 runs were: printf, OCaml 4.02.1: 1.76[s] print_*, OCaml 4.02.1: 1.62[s] printf, OCaml 4.01.0: 2.04[s] print_*, OCaml 4.01.0: 1.70[s] The overhead of using printf is about 8% in 4.02.1, as opposed to about 20% in 4.01.0. So in 4.02 the effect is noticeably smaller, though not unmeasurable. We should note this in the doc in a future release. FYI, the experiment can be reproduced as follows (in bash syntax), using the attached patch (referred to as /tmp/kinFerTron_dns_printf.diff below): # Start at the root of the source tree. opam switch 4.02.1; eval `opam config env` ./configure && make clean all cd examples/kinsol/serial # Measure without modification. make PERF_DATA_POINTS=3D10 kinFerTron_dns.opt.perf ../../utils/crunchperf -m kinFerTron_dns.opt.perf > no-printf-4.02.1-kinFerTron_dns.opt.perf # Apply patch and measure again. patch -p4 < /tmp/kinFerTron_dns_printf.diff make PERF_DATA_POINTS=3D10 kinFerTron_dns.opt.perf ../../utils/crunchperf -m kinFerTron_dns.opt.perf > printf-4.02.1-kinFerTron_dns.opt.perf # Undo changes. patch -p4 -R < /tmp/kinFerTron_dns_printf.diff cd ../../.. # Back at the root of the source tree. opam switch 4.01.0; eval `opam config env` # (Basically the same thing as above) ./configure && make clean all cd examples/kinsol/serial make PERF_DATA_POINTS=3D10 kinFerTron_dns.opt.perf ../../utils/crunchperf -m kinFerTron_dns.opt.perf > no-printf-4.01.0-kinFerTron_dns.opt.perf patch -p4 < /tmp/kinFerTron_dns_printf.diff make PERF_DATA_POINTS=3D10 kinFerTron_dns.opt.perf ../../utils/crunchperf -m kinFerTron_dns.opt.perf > printf-4.01.0-kinFerTron_dns.opt.perf # Summarize results for i in *kinFerTron_dns.opt.perf; do printf "\n[$i]\n"; ../../utils/crunchperf -s $i; done On Fri, Nov 28, 2014 at 6:44 PM, Gabriel Scherer wrote: > Thanks for the significant effort put in documenting the bindings > (and, of course, the cool software and research); your "information > and documentation" page is impressive. > > The page has a very interesting performance comparison of numeric code > partly or fully written in OCaml (using bigarrays of floats) -- and > the not-so-surprising results is that the run times of the OCaml > programs are between 100% and 200% of the run time of the reference C > implementation. > > ( http://inria-parkas.github.io/sundialsml/perf.opt.png ) > > I'm curious about this specific part of the explanation: > >> For instance, some OCaml versions spend a significant fraction of their = time >> in printf, and we were able to lower their ratios by instead using print= _string and print_int. > > The new 4.02 implementation of formats, due to Beno=C3=AEt Vaugon, should > be significantly faster (in my experience they match the performance of t= he > less-readable print_* sequence in most situations). Did you try those > OCaml versions with 4.02? > > On Fri, Nov 28, 2014 at 2:39 PM, Timothy Bourke = wrote: >> We are pleased to announce Sundials/ML, an OCaml interface to the >> Sundials suite of numerical solvers (CVODE, CVODES, IDA, IDAS, KINSOL). >> >> Information and documentation: http://inria-parkas.github.io/sundialsml/ >> Source code (BSD): https://github.com/inria-parkas/sundialsml >> >> opam install sundialsml # (requires Sundials 2.5.0) >> >> We gratefully acknowledge the original authors of Sundials, and the >> support of the ITEA 3 project 11004 MODRIO (Model driven physical >> systems operation), Inria, and the Departement d'Informatique de l'ENS. >> >> Timothy Bourke, Jun Inoue, and Marc Pouzet. >> --=20 Jun Inoue --20cf303ea582ffae830508f1a132 Content-Type: text/plain; charset=US-ASCII; name="kinFerTron_dns_printf.diff" Content-Disposition: attachment; filename="kinFerTron_dns_printf.diff" Content-Transfer-Encoding: base64 X-Attachment-Id: f_i32114560 ZGlmZiAtLWdpdCBhL2V4YW1wbGVzL2tpbnNvbC9zZXJpYWwva2luRmVyVHJv bl9kbnMubWwgYi9leGFtcGxlcy9raW5zb2wvc2VyaWFsL2tpbkZlclRyb25f ZG5zLm1sCmluZGV4IDUyNWNmZmMuLmVkZjRiYjggMTAwNjQ0Ci0tLSBhL2V4 YW1wbGVzL2tpbnNvbC9zZXJpYWwva2luRmVyVHJvbl9kbnMubWwKKysrIGIv ZXhhbXBsZXMva2luc29sL3NlcmlhbC9raW5GZXJUcm9uX2Rucy5tbApAQCAt MTE1LDIxICsxMTUsMTMgQEAgbGV0IHNldF9pbml0aWFsX2d1ZXNzMiB1ZGF0 YSA9CiAKICgqIFByaW50IGZpcnN0IGxpbmVzIG9mIG91dHB1dCAocHJvYmxl bSBkZXNjcmlwdGlvbikgKikKIGxldCBwcmludF9oZWFkZXIgZm5vcm10b2wg c2NzdGVwdG9sID0KLSAgcHJpbnRfc3RyaW5nICJcbkZlcnJhcmlzIGFuZCBU cm9uY29uaSB0ZXN0IHByb2JsZW1cbiI7Ci0gIHByaW50X3N0cmluZyAiVG9s ZXJhbmNlIHBhcmFtZXRlcnM6XG4iOworICBwcmludGYgIlxuRmVycmFyaXMg YW5kIFRyb25jb25pIHRlc3QgcHJvYmxlbVxuIjsKKyAgcHJpbnRmICJUb2xl cmFuY2UgcGFyYW1ldGVyczpcbiI7CiAgIHByaW50ZiAiICBmbm9ybXRvbCAg PSAlMTAuNmdcbiAgc2NzdGVwdG9sID0gJTEwLjZnXG4iIGZub3JtdG9sIHNj c3RlcHRvbAogCiAoKiBQcmludCBzb2x1dGlvbiAqKQogbGV0IHByaW50X291 dHB1dCB1ID0gcHJpbnRmICIgJTguNmcgICU4LjZnXG4iIHUuezB9IHUuezF9 CiAKLWxldCBwcmludF9zdHJpbmdfNWQgcyBpID0KLSAgcHJpbnRfc3RyaW5n IHM7Ci0gIGlmIGkgPCAxMCB0aGVuIHByaW50X3N0cmluZyAiICAgICIKLSAg ZWxzZSBpZiBpIDwgMTAwIHRoZW4gcHJpbnRfc3RyaW5nICIgICAiCi0gIGVs c2UgaWYgaSA8IDEwMDAgdGhlbiBwcmludF9zdHJpbmcgIiAgIgotICBlbHNl IGlmIGkgPCAxMDAwMCB0aGVuIHByaW50X3N0cmluZyAiICI7Ci0gIHByaW50 X2ludCBpCi0KICgqIFByaW50IGZpbmFsIHN0YXRpc3RpY3MgY29udGFpbmVk IGluIGlvcHQgKikKICgqIEZvciBoaWdoIE5VTV9SRVBTLCB0aGUgY29zdCBv ZiBPQ2FtbCBwcmludGYgYmVjb21lcyBpbXBvcnRhbnQhICopCiBsZXQgcHJp bnRfZmluYWxfc3RhdHMga21lbSA9CkBAIC0xMzcsMjEgKzEyOSwyMCBAQCBs ZXQgcHJpbnRfZmluYWxfc3RhdHMga21lbSA9CiAgIGxldCBuZmUgID0gS2lu c29sLmdldF9udW1fZnVuY19ldmFscyBrbWVtIGluCiAgIGxldCBuamUgID0g S2luc29sLkRscy5nZXRfbnVtX2phY19ldmFscyBrbWVtIGluCiAgIGxldCBu ZmVEID0gS2luc29sLkRscy5nZXRfbnVtX2Z1bmNfZXZhbHMga21lbSBpbgot ICBwcmludF9zdHJpbmcgIkZpbmFsIFN0YXRpc3RpY3M6XG4iOwotICBwcmlu dF9zdHJpbmdfNWQgIiAgbm5pID0gIiBubmk7Ci0gIHByaW50X3N0cmluZ181 ZCAiICAgIG5mZSAgPSAiIG5mZTsKLSAgcHJpbnRfc3RyaW5nXzVkICIgXG4g IG5qZSA9ICIgbmplOwotICBwcmludF9zdHJpbmdfNWQgIiAgICBuZmVEID0g IiBuZmVEOwotICBwcmludF9zdHJpbmcgIiBcbiIKKyAgcHJpbnRmICJGaW5h bCBTdGF0aXN0aWNzOlxuIjsKKyAgcHJpbnRmICIgIG5uaSA9ICU1ZCAgICBu ZmUgID0gJTVkIFxuIiBubmkgbmZlOworICBwcmludGYgIiAgbmplID0gJTVk ICAgIG5mZUQgPSAlNWQgXG4iIG5qZSBuZmVECiAKICgqIE1BSU4gUFJPR1JB TSAqKQogbGV0IHNvbHZlX2l0IGttZW0gdSBzIGdsc3RyIG1zZXQgPQogICBw cmludF9uZXdsaW5lICgpOwotICBwcmludF9zdHJpbmcgKGlmIG1zZXQ9PTEg dGhlbiAiRXhhY3QgTmV3dG9uIiBlbHNlICJNb2RpZmllZCBOZXd0b24iKTsK LSAgaWYgbm90IGdsc3RyIHRoZW4gcHJpbnRfbmV3bGluZSAoKSBlbHNlIHBy aW50X3N0cmluZyAiIHdpdGggbGluZSBzZWFyY2hcbiI7CisgIGlmIG1zZXQg PT0gMSB0aGVuIHByaW50ZiAiRXhhY3QgTmV3dG9uIgorICBlbHNlIHByaW50 ZiAiTW9kaWZpZWQgTmV3dG9uIjsKKyAgaWYgbm90IGdsc3RyIHRoZW4gcHJp bnRmICJcbiIKKyAgZWxzZSBwcmludGYgIiB3aXRoIGxpbmUgc2VhcmNoXG4i OwogICBLaW5zb2wuc2V0X21heF9zZXR1cF9jYWxscyBrbWVtIG1zZXQ7CiAg IGlnbm9yZSAoS2luc29sLnNvbHZlIGttZW0gdSBnbHN0ciBzIHMpOwotICBw cmludF9zdHJpbmcgIlNvbHV0aW9uOlxuICBbeDEseDJdID0gIjsKKyAgcHJp bnRmICJTb2x1dGlvbjpcbiAgW3gxLHgyXSA9ICI7CiAgIHByaW50X291dHB1 dCAoTnZlY3Rvci51bndyYXAgdSk7CiAgIHByaW50X2ZpbmFsX3N0YXRzIGtt ZW0KIApAQCAtMTkzLDkgKzE4NCw5IEBAIGxldCBtYWluICgpID0KIAogICAo KiAtLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0gKikKIAotICBwcmludF9z dHJpbmcgIlxuLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t LS0tLS0tXG4iOwotICBwcmludF9zdHJpbmcgIlxuSW5pdGlhbCBndWVzcyBv biBsb3dlciBib3VuZHNcbiI7Ci0gIHByaW50X3N0cmluZyAiICBbeDEseDJd ID0gIjsKKyAgcHJpbnRmICJcbi0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t LS0tLS0tLS0tLS0tLS0tLVxuIjsKKyAgcHJpbnRmICJcbkluaXRpYWwgZ3Vl c3Mgb24gbG93ZXIgYm91bmRzXG4iOworICBwcmludGYgIiAgW3gxLHgyXSA9 ICI7CiAgIHByaW50X291dHB1dCB1MTsKIAogICBSZWFsQXJyYXkuYmxpdCB1 MSB1OwpAQCAtMjE4LDkgKzIwOSw5IEBAIGxldCBtYWluICgpID0KIAogICAo KiAtLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0gKikKIAotICBwcmludF9z dHJpbmcgIlxuLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t LS0tLS0tXG4iOwotICBwcmludF9zdHJpbmcgIlxuSW5pdGlhbCBndWVzcyBp biBtaWRkbGUgb2YgZmVhc2libGUgcmVnaW9uXG4iOwotICBwcmludF9zdHJp bmcgIiAgW3gxLHgyXSA9ICI7CisgIHByaW50ZiAiXG4tLS0tLS0tLS0tLS0t LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS1cbiI7CisgIHByaW50ZiAi XG5Jbml0aWFsIGd1ZXNzIGluIG1pZGRsZSBvZiBmZWFzaWJsZSByZWdpb25c biI7CisgIHByaW50ZiAiICBbeDEseDJdID0gIjsKICAgcHJpbnRfb3V0cHV0 IHUyOwogCiAgIFJlYWxBcnJheS5ibGl0IHUyIHU7Cg== --20cf303ea582ffae830508f1a132--