From: "Christoph Höger" <christoph.hoeger@tu-berlin.de>
To: caml-list@inria.fr
Subject: Re: [Caml-list] Closing the performance gap to C
Date: Sat, 17 Dec 2016 14:02:57 +0100
Message-ID: <adf19464-c995-0e02-48e9-100f0efd26b6@tu-berlin.de>
In-Reply-To: <7bc766a2-d460-524b-35ca-89609a34b719@tu-berlin.de>
[-- Attachment #1.1.1: Type: text/plain, Size: 2330 bytes --]
Oops. I forgot the actual examples.
On 17.12.2016 at 14:01, Christoph Höger wrote:
> Dear all,
>
> Please find attached two simple Runge-Kutta iteration schemes. One is
> written in C, the other in OCaml. I compared the runtime of both, and gcc
> (-O2) produces an executable that is roughly 30% faster (to be more
> precise: 3.52s for OCaml vs. 2.63s for C). That is in itself quite
> pleasing, I think. However, I do not understand what causes this
> difference. Admittedly, the generated assembly looks completely different,
> but both compilers inline all functions and generate one big loop. OCaml
> generates a lot more scaffolding, but that is to be expected.
>
> There is, however, an interesting peculiarity: OCaml generates 6 calls
> to cos, while gcc only needs 3 (and one direct jump). Surprisingly,
> there are also calls to cosh, acos and pretty much every other
> trigonometric function (initialization of constants, maybe?).
>
> However, the true culprit seems to be an excess of instructions between
> the different calls to cos. This is what happens between the first two
> calls to cos:
>
> gcc:
> jmpq 400530 <cos@plt>
> nop
> nopw %cs:0x0(%rax,%rax,1)
>
> sub $0x38,%rsp
> movsd %xmm0,0x10(%rsp)
> movapd %xmm1,%xmm0
> movsd %xmm2,0x18(%rsp)
> movsd %xmm1,0x8(%rsp)
> callq 400530 <cos@plt>
>
> ocamlopt:
>
> callq 401a60 <cos@plt>
> mulsd (%r12),%xmm0
> movsd %xmm0,0x10(%rsp)
> sub $0x10,%r15
> lea 0x25c7b6(%rip),%rax
> cmp (%rax),%r15
> jb 404a8a <dlerror@plt+0x2d0a>
> lea 0x8(%r15),%rax
> movq $0x4fd,-0x8(%rax)
>
> movsd 0x32319(%rip),%xmm1
>
> movapd %xmm1,%xmm2
> mulsd %xmm0,%xmm2
> addsd 0x0(%r13),%xmm2
> movsd %xmm2,(%rax)
> movapd %xmm1,%xmm0
> mulsd (%r12),%xmm0
> addsd (%rbx),%xmm0
> callq 401a60 <cos@plt>
>
>
> Is this caused by some underlying difference in the representation of
> numeric values (i.e. tagged ints) or is it reasonable to attack this
> issue as a hobby experiment?
>
>
> Thanks for any advice,
>
> Christoph
>
--
Christoph Höger
Technische Universität Berlin
Fakultät IV - Elektrotechnik und Informatik
Übersetzerbau und Programmiersprachen
Sekr. TEL12-2, Ernst-Reuter-Platz 7, 10587 Berlin
Tel.: +49 (30) 314-24890
E-Mail: christoph.hoeger@tu-berlin.de
[-- Attachment #1.1.2: rk4.c --]
[-- Type: text/plain, Size: 843 bytes --]
#include <stdio.h>
#include <math.h>

double exact(double t) { return sin(t); }

double dy(double t, double y) { return cos(t); }

double rk4_step(double y, double t, double h) {
    double k1 = h * dy(t, y);
    double k2 = h * dy(t + 0.5 * h, y + 0.5 * k1);
    double k3 = h * dy(t + 0.5 * h, y + 0.5 * k2);
    double k4 = h * dy(t + h, y + k3);
    return y + (k1 + k4) / 6.0 + (k2 + k3) / 3.0;
}

double loop(int steps, double h, int n, double y, double t) {
    if (n < steps)
        return loop(steps, h, n + 1, rk4_step(y, t, h), t + h);
    else
        return y;
}

int main(void) {
    double h = 0.1;
    double y = loop(102, h, 1, 1.0, 0.0);
    double err = fabs(y - exact(102 * h));
    int large = 10000000;
    double y2 = loop(large, h, 1, 1.0, 0.0);
    printf("%d\n",
           (fabs(y2 - (exact(large * h))) < 2. * err));
    return 0;
}
[-- Attachment #1.1.3: testrk4.ml --]
[-- Type: text/plain, Size: 653 bytes --]
let y' t y = cos t

let exact t = sin t

let rk4_step y t h =
  let k1 = h *. y' t y in
  let k2 = h *. y' (t +. 0.5 *. h) (y +. 0.5 *. k1) in
  let k3 = h *. y' (t +. 0.5 *. h) (y +. 0.5 *. k2) in
  let k4 = h *. y' (t +. h) (y +. k3) in
  y +. (k1 +. k4) /. 6.0 +. (k2 +. k3) /. 3.0

let rec loop steps h n y t =
  if n < steps then loop steps h (n + 1) (rk4_step y t h) (t +. h)
  else y

let _ =
  let h = 0.1 in
  let y = loop 102 h 1 1.0 0.0 in
  let err = abs_float (y -. exact (float_of_int 102 *. h)) in
  let large = 10000000 in
  let y = loop large h 1 1.0 0.0 in
  (* Parenthesized so that h scales the argument of exact, as in rk4.c. *)
  Printf.printf "%b\n"
    (abs_float (y -. exact (float_of_int large *. h)) < 2. *. err)
Thread overview: 25+ messages
2016-12-17 13:01 Christoph Höger
2016-12-17 13:02 ` Christoph Höger [this message]
2016-12-19 10:58 ` Soegtrop, Michael
2016-12-19 11:51 ` Gerd Stolpmann
2016-12-19 14:52 ` Soegtrop, Michael
2016-12-19 16:41 ` Gerd Stolpmann
2016-12-19 17:09 ` Frédéric Bour
2016-12-19 17:19 ` Yotam Barnoy
2016-12-21 11:25 ` Alain Frisch
2016-12-21 14:45 ` Yotam Barnoy
2016-12-21 16:06 ` Alain Frisch
2016-12-21 16:31 ` Gerd Stolpmann
2016-12-21 16:39 ` Yotam Barnoy
2016-12-21 16:47 ` Gabriel Scherer
2016-12-21 16:51 ` Yotam Barnoy
2016-12-21 16:56 ` Mark Shinwell
2016-12-21 17:43 ` Alain Frisch
2016-12-22 8:39 ` Mark Shinwell
2016-12-22 17:23 ` Pierre Chambart
2016-12-21 17:35 ` Alain Frisch
2016-12-19 15:48 ` Ivan Gotovchits
2016-12-19 16:44 ` Yotam Barnoy
2016-12-19 16:59 ` Ivan Gotovchits
2016-12-21 9:08 ` Christoph Höger
2016-12-23 12:18 ` Oleg