* [Caml-list] caml perfomance
@ 2001-09-03 8:16 Anton Moscal
From: Anton Moscal @ 2001-09-03 8:16 UTC
To: caml; +Cc: Nickolay Kolchin
Hi!
My friend and I have investigated the quality of the floating-point code generated by O'Caml. We
found some strange cases (all assembler code is for the IA32 architecture):
--------------------------------------------------------------------------
1. "let _ = f a in" and "ignore (f a);" generate different code.
For example:
let _ = ignore (sin 1.0); ()
produces the following:
Float2_entry:
        subl    $8, %esp
.L100:
        fld1
        subl    $8, %esp
        fstpl   0(%esp)
        call    sin
        addl    $8, %esp
        fstpl   0(%esp)
;-----------------------! floating value boxed before discarding
        call    caml_alloc2
.L101:
        leal    4(%eax), %eax
        movl    $2301, -4(%eax)
        fldl    0(%esp)
        fstpl   (%eax)
;-----------------------------------------------------------------
        movl    $1, %eax
        addl    $8, %esp
        ret
let _ = let _ = sin 1.0 in ()
produces the following (much better) code:
Float1_entry:
        subl    $8, %esp
.L100:
        fld1
        subl    $8, %esp
        fstpl   0(%esp)
        call    sin
        addl    $8, %esp
        fstpl   0(%esp)
;--- Instead of discarding the value from the FP stack, O'Caml allocates
;    space for the result on the main stack, stores the value there,
;    and only then discards it.
        movl    $1, %eax
        addl    $8, %esp
        ret
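To reproduce the comparison, both forms can be put into one file and compiled with "ocamlopt -S", which keeps the generated assembly file. This is only a minimal sketch: the function names with_ignore and with_let are made up for illustration, and other compiler versions may well generate different code from the listings above.

```ocaml
(* Hypothetical test file; the names are illustrative, not from the
   original attachment. Compile with: ocamlopt -S float_test.ml
   and compare the two functions in the resulting float_test.s. *)

(* Form 1: the result of sin is passed to ignore. *)
let with_ignore () = ignore (sin 1.0)

(* Form 2: the result of sin is bound to a wildcard and dropped. *)
let with_let () = let _ = sin 1.0 in ()

let () =
  with_ignore ();
  with_let ()
```

Both functions have type unit -> unit, so any difference in the emitted code comes from how the float result is discarded.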
--------------------------------------------------------------------------
2. Passing an FP parameter on the stack is less efficient than it could be.
O'Caml currently generates this code for pushing float arguments:
fldl <address>
subl $8, %esp
fstpl 0(%esp)
whereas gcc generates two pushes:
pushl <const>
pushl <const>
or
movl %address,%reg1
movl %address+4,%reg2
pushl %reg2
pushl %reg1
3. (Less important) O'Caml never generates the assembler trigonometric instructions: fsin,
fcos...
Yes, they violate the ANSI rules, but using them can improve the performance of an
application. Write a simple C program that uses sin or cos and compile it
with and without the "-ffast-math" switch. The difference is about 25%.
example.c:
#include <math.h>

int main(void)
{
    int i;
    /* volatile keeps the compiler from optimizing the loop away */
    volatile double d = 1.0;

    for (i = 0; i < 10000000; i += 1)
        d = cos(sin(d));
    return 0;
}
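For measuring the same loop on the O'Caml side, a counterpart of example.c might look like the sketch below. It is an assumption for illustration, not the contents of the attached programs.tgz: the function name iterate is made up, and timing uses Sys.time from the standard library, which reports processor time in seconds.

```ocaml
(* Hypothetical O'Caml counterpart of example.c; not taken from the
   original attachment. *)

(* Apply d <- cos (sin d) n times, starting from 1.0, and return d. *)
let iterate n =
  let d = ref 1.0 in
  for _i = 1 to n do
    d := cos (sin !d)
  done;
  !d

let () =
  let n = 10_000_000 in
  let t0 = Sys.time () in
  let result = iterate n in
  let t1 = Sys.time () in
  (* Printing the result keeps the loop from being treated as dead code. *)
  Printf.printf "result = %f, cpu time = %.3f s\n" result (t1 -. t0)
```

Compiled with ocamlopt, the reported time can be set against the C program built with and without "-ffast-math".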
Regards,
Anton Moscal
Nickolay Kolchin
[-- Attachment #2: programs.tgz --]
[-- Type: application/x-gzip, Size: 715 bytes --]