Mailing list for all users of the OCaml language and system.
* [Caml-list] caml perfomance
@ 2001-09-03  8:16 Anton Moscal
From: Anton Moscal @ 2001-09-03  8:16 UTC (permalink / raw)
  To: caml; +Cc: Nickolay Kolchin


Hi!

My friend and I have investigated the quality of the floating-point code that O'Caml
generates. We found some strange cases (all assembly code is for the IA32 architecture):

--------------------------------------------------------------------------
1. "let _ = f a in" and "ignore (f a);" generate different code.

For example:

let _ = ignore (sin 1.0); ()

produces the following:

Float2_entry:
        subl    $8, %esp
.L100:
        fld1
        subl    $8, %esp
        fstpl   0(%esp)
        call    sin
        addl    $8, %esp
        fstpl   0(%esp)
;-----------------------! floating value boxed before discarding
        call    caml_alloc2
.L101:
        leal    4(%eax), %eax
        movl    $2301, -4(%eax)
        fldl    0(%esp)
        fstpl   (%eax)
;-----------------------------------------------------------------
        movl    $1, %eax
        addl    $8, %esp
        ret



let _ = let _ = sin 1.0 in ()

produces the following (much better) code:

Float1_entry:
        subl    $8, %esp
.L100:
        fld1
        subl    $8, %esp
        fstpl   0(%esp)
        call    sin
        addl    $8, %esp
        fstpl   0(%esp)
;--- instead of simply discarding the value from the FP stack, O'Caml
; allocates space for the result on the main stack, stores the value
; there, and only then discards it.
        movl    $1, %eax
        addl    $8, %esp
        ret


--------------------------------------------------------------------------
2. Passing FP parameters on the stack is less efficient than it could be:

O'Caml currently generates this code to push a float argument:

fldl <address>
subl $8, %esp
fstpl 0(%esp)

whereas gcc generates two pushes:

pushl <const>
pushl <const>

or

movl %address,%reg1
movl %address+4,%reg2
pushl %reg2
pushl %reg1

--------------------------------------------------------------------------
3. (Less important) O'Caml never generates the assembler trigonometric
instructions: fsin, fcos...

Yes, they violate ANSI rules, but using them can improve an application's
performance. Write a simple C program that uses sin or cos and compile it
with and without the "-ffast-math" switch. The difference is about 25%.

example.c:
#include <math.h>

int main(void)
{
    int i;
    /* volatile keeps the compiler from optimizing the loop away */
    volatile double d = 1.0;
    for (i = 0; i < 10000000; i += 1)
        d = cos(sin(d));
    return 0;
}

Regards, 
Anton Moscal
Nickolay Kolchin

[-- Attachment #2: programs.tgz --]
[-- Type: application/x-gzip, Size: 715 bytes --]