Thank you for sharing this interesting information, Gabriel! We
benchmarked the code exclusively with 4.01.0 and didn't know about the
performance boost in printf.

I just measured the performance with
examples/kinsol/serial/kinFerTron_dns.ml, which was one of the examples
that had the most pronounced effect. As you suggest, the numbers are a
lot closer with 4.02.1. The median wall-clock times of 10 runs were:

    printf,  OCaml 4.02.1: 1.76[s]
    print_*, OCaml 4.02.1: 1.62[s]
    printf,  OCaml 4.01.0: 2.04[s]
    print_*, OCaml 4.01.0: 1.70[s]

The overhead of using printf is about 8% in 4.02.1, as opposed to about
20% in 4.01.0. So in 4.02 the effect is noticeably smaller, though not
unmeasurable. We should note this in the doc in a future release.

FYI, the experiment can be reproduced as follows (in bash syntax), using
the attached patch (referred to as /tmp/kinFerTron_dns_printf.diff
below):

    # Start at the root of the source tree.
    opam switch 4.02.1; eval `opam config env`
    ./configure && make clean all
    cd examples/kinsol/serial

    # Measure without modification.
    make PERF_DATA_POINTS=10 kinFerTron_dns.opt.perf
    ../../utils/crunchperf -m kinFerTron_dns.opt.perf > no-printf-4.02.1-kinFerTron_dns.opt.perf

    # Apply patch and measure again.
    patch -p4 < /tmp/kinFerTron_dns_printf.diff
    make PERF_DATA_POINTS=10 kinFerTron_dns.opt.perf
    ../../utils/crunchperf -m kinFerTron_dns.opt.perf > printf-4.02.1-kinFerTron_dns.opt.perf

    # Undo changes.
    patch -p4 -R < /tmp/kinFerTron_dns_printf.diff
    cd ../../..

    # Back at the root of the source tree.
    opam switch 4.01.0; eval `opam config env`

    # (Basically the same thing as above)
    ./configure && make clean all
    cd examples/kinsol/serial
    make PERF_DATA_POINTS=10 kinFerTron_dns.opt.perf
    ../../utils/crunchperf -m kinFerTron_dns.opt.perf > no-printf-4.01.0-kinFerTron_dns.opt.perf
    patch -p4 < /tmp/kinFerTron_dns_printf.diff
    make PERF_DATA_POINTS=10 kinFerTron_dns.opt.perf
    ../../utils/crunchperf -m kinFerTron_dns.opt.perf > printf-4.01.0-kinFerTron_dns.opt.perf

    # Summarize results
    for i in *kinFerTron_dns.opt.perf; do printf "\n[$i]\n"; ../../utils/crunchperf -s $i; done

On Fri, Nov 28, 2014 at 6:44 PM, Gabriel Scherer wrote:
> Thanks for the significant effort put in documenting the bindings
> (and, of course, the cool software and research); your "information
> and documentation" page is impressive.
>
> The page has a very interesting performance comparison of numeric code
> partly or fully written in OCaml (using bigarrays of floats) -- and
> the not-so-surprising result is that the run times of the OCaml
> programs are between 100% and 200% of the run time of the reference C
> implementation.
>
> ( http://inria-parkas.github.io/sundialsml/perf.opt.png )
>
> I'm curious about this specific part of the explanation:
>
>> For instance, some OCaml versions spend a significant fraction of their time
>> in printf, and we were able to lower their ratios by instead using print_string and print_int.
>
> The new 4.02 implementation of formats, due to Benoît Vaugon, should
> be significantly faster (in my experience they match the performance of the
> less-readable print_* sequence in most situations). Did you try those
> OCaml versions with 4.02?
>
> On Fri, Nov 28, 2014 at 2:39 PM, Timothy Bourke wrote:
>> We are pleased to announce Sundials/ML, an OCaml interface to the
>> Sundials suite of numerical solvers (CVODE, CVODES, IDA, IDAS, KINSOL).
>>
>> Information and documentation: http://inria-parkas.github.io/sundialsml/
>> Source code (BSD): https://github.com/inria-parkas/sundialsml
>>
>> opam install sundialsml # (requires Sundials 2.5.0)
>>
>> We gratefully acknowledge the original authors of Sundials, and the
>> support of the ITEA 3 project 11004 MODRIO (Model driven physical
>> systems operation), Inria, and the Departement d'Informatique de l'ENS.
>>
>> Timothy Bourke, Jun Inoue, and Marc Pouzet.
>>

-- 
Jun Inoue
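
P.S. As a sanity check, the overhead percentages can be recomputed from
the rounded medians quoted at the top of this message. This is only a
sketch: overhead_pct is a hypothetical helper (it is not part of
Sundials/ML or crunchperf), it assumes awk is installed, and it takes the
print_* time as the baseline. Because the inputs are rounded to two
decimals, the results may differ slightly from figures computed on the
raw measurements.

```shell
# overhead_pct is a hypothetical helper, not part of Sundials/ML.
# Usage: overhead_pct PRINTF_SECONDS PRINT_STAR_SECONDS
# Prints the extra time taken by the printf variant, as a percentage
# of the print_* (baseline) time, with one decimal place.
overhead_pct() {
  awk -v t_printf="$1" -v t_base="$2" \
    'BEGIN { printf "%.1f\n", (t_printf - t_base) / t_base * 100 }'
}

overhead_pct 1.76 1.62   # medians for OCaml 4.02.1
overhead_pct 2.04 1.70   # medians for OCaml 4.01.0
```

Using the print_* time as the denominator expresses the cost of switching
to printf; dividing by the printf time instead would give slightly
smaller figures.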