* [Caml-list] Compiler bug?
@ 2012-08-06 10:04 Dmitry Bely
2012-08-06 10:11 ` Alain Frisch
0 siblings, 1 reply; 30+ messages in thread
From: Dmitry Bely @ 2012-08-06 10:04 UTC (permalink / raw)
To: caml-list
Trying to catch some very rare floating-point related bug I've
realized that I probably don't understand some basic things about
OCaml execution model. Let's consider the simplest function:
let add1 x = x +. 1.
For x86 it is compiled as
.CODE
ALIGN 4
PUBLIC _camlTest__add1_1008
_camlTest__add1_1008:
sub esp, 8
L100:
mov ebx, eax
L101: mov eax, _caml_young_ptr
sub eax, 12
mov _caml_young_ptr, eax
cmp eax, _caml_young_limit
jb L102
lea eax, [eax+4]
mov DWORD PTR [eax-4],2301
fld1
fadd REAL8 PTR [ebx]
fstp REAL8 PTR [eax]
add esp, 8
ret
L102: call _caml_call_gc
L103: jmp L101
X parameter is passed as a pointer to float value ([eax] which is
immediately saved to [ebx]). But If GC happens (L102), float value can
be moved around the OCaml heap and [ebx] become invalid, right? So the
generated code is just wrong. Am I missing something?
- Dmitry Bely
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Caml-list] Compiler bug?
2012-08-06 10:04 [Caml-list] Compiler bug? Dmitry Bely
@ 2012-08-06 10:11 ` Alain Frisch
2012-08-06 10:20 ` Dmitry Bely
0 siblings, 1 reply; 30+ messages in thread
From: Alain Frisch @ 2012-08-06 10:11 UTC (permalink / raw)
To: Dmitry Bely; +Cc: caml-list
On 08/06/2012 12:04 PM, Dmitry Bely wrote:
> X parameter is passed as a pointer to float value ([eax] which is
> immediately saved to [ebx]). But If GC happens (L102), float value can
> be moved around the OCaml heap and [ebx] become invalid, right? So the
> generated code is just wrong. Am I missing something?
The GC knows about the location of OCaml values, including those stored
in registers when it is called (the compiler generate tables to retrieve
this information from the GC call site). These values are used as roots
for the GC, and they are updated when the value is moved.
Alain
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Caml-list] Compiler bug?
2012-08-06 10:11 ` Alain Frisch
@ 2012-08-06 10:20 ` Dmitry Bely
2012-08-06 10:34 ` Alain Frisch
0 siblings, 1 reply; 30+ messages in thread
From: Dmitry Bely @ 2012-08-06 10:20 UTC (permalink / raw)
To: caml-list
On Mon, Aug 6, 2012 at 2:11 PM, Alain Frisch <alain@frisch.fr> wrote:
> On 08/06/2012 12:04 PM, Dmitry Bely wrote:
>>
>> X parameter is passed as a pointer to float value ([eax] which is
>> immediately saved to [ebx]). But If GC happens (L102), float value can
>> be moved around the OCaml heap and [ebx] become invalid, right? So the
>> generated code is just wrong. Am I missing something?
>
>
> The GC knows about the location of OCaml values, including those stored in
> registers when it is called (the compiler generate tables to retrieve this
> information from the GC call site). These values are used as roots for the
> GC, and they are updated when the value is moved.
I always thought that local roots are memory-based. How GC knows that
in this case [ebx] should be updated? (remember, the parameter was
passed in [eax]) Could you point me to the relevant part of Ocaml
compiler sources?
- Dmitry Bely
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Caml-list] Compiler bug?
2012-08-06 10:20 ` Dmitry Bely
@ 2012-08-06 10:34 ` Alain Frisch
2012-08-06 11:03 ` Dmitry Bely
0 siblings, 1 reply; 30+ messages in thread
From: Alain Frisch @ 2012-08-06 10:34 UTC (permalink / raw)
To: Dmitry Bely; +Cc: caml-list
On 08/06/2012 12:20 PM, Dmitry Bely wrote:
> I always thought that local roots are memory-based. How GC knows that
> in this case [ebx] should be updated? (remember, the parameter was
> passed in [eax]) Could you point me to the relevant part of Ocaml
> compiler sources?
The compiler generates 'frame descriptors', which associate to each
possible GC call site the set of registers and stack slots which hold
values.
Look for caml_frame_descriptors. The logic to scan frame descriptors is
in asmrun/roots.c. The implementation of caml_call_gc (e.g. in amd64.S)
saves/restored machine registers to/from caml_gc_regs.
Alain
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Caml-list] Compiler bug?
2012-08-06 10:34 ` Alain Frisch
@ 2012-08-06 11:03 ` Dmitry Bely
2012-08-06 11:32 ` Alain Frisch
0 siblings, 1 reply; 30+ messages in thread
From: Dmitry Bely @ 2012-08-06 11:03 UTC (permalink / raw)
To: caml-list
On Mon, Aug 6, 2012 at 2:34 PM, Alain Frisch <alain@frisch.fr> wrote:
> On 08/06/2012 12:20 PM, Dmitry Bely wrote:
>>
>> I always thought that local roots are memory-based. How GC knows that
>> in this case [ebx] should be updated? (remember, the parameter was
>> passed in [eax]) Could you point me to the relevant part of Ocaml
>> compiler sources?
>
>
> The compiler generates 'frame descriptors', which associate to each possible
> GC call site the set of registers and stack slots which hold values.
>
> Look for caml_frame_descriptors. The logic to scan frame descriptors is in
> asmrun/roots.c. The implementation of caml_call_gc (e.g. in amd64.S)
> saves/restored machine registers to/from caml_gc_regs.
Aha. _caml_call_gc is
_caml_call_gc:
; Record lowest stack address and return address
mov eax, [esp]
mov _caml_last_return_address, eax
lea eax, [esp+4]
mov _caml_bottom_of_stack, eax
; Save all regs used by the code generator
L105: push ebp
push edi
push esi
push edx
push ecx
push ebx
push eax
mov _caml_gc_regs, esp
; Call the garbage collector
call _caml_garbage_collection
; Restore all regs used by the code generator
pop eax
pop ebx
pop ecx
pop edx
pop esi
pop edi
pop ebp
; Return to caller
ret
it preserves registers and so it cannot update them, like it does with
memory-based local roots. The initial question if [ebx] contents is
invalidated by GC remains open, at least for me.
- Dmitry Bely
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Caml-list] Compiler bug?
2012-08-06 11:03 ` Dmitry Bely
@ 2012-08-06 11:32 ` Alain Frisch
2012-08-06 12:16 ` Dmitry Bely
` (2 more replies)
0 siblings, 3 replies; 30+ messages in thread
From: Alain Frisch @ 2012-08-06 11:32 UTC (permalink / raw)
To: Dmitry Bely; +Cc: caml-list
On 08/06/2012 01:03 PM, Dmitry Bely wrote:
> _caml_call_gc:
> ; Record lowest stack address and return address
> mov eax, [esp]
> mov _caml_last_return_address, eax
> lea eax, [esp+4]
> mov _caml_bottom_of_stack, eax
> ; Save all regs used by the code generator
> L105: push ebp
> push edi
> push esi
> push edx
> push ecx
> push ebx
> push eax
> mov _caml_gc_regs, esp
> ; Call the garbage collector
> call _caml_garbage_collection
> ; Restore all regs used by the code generator
> pop eax
> pop ebx
> pop ecx
> pop edx
> pop esi
> pop edi
> pop ebp
> ; Return to caller
> ret
>
> it preserves registers and so it cannot update them
No, this code does not preserves registers. It stores them on the
stack, pass a pointer to them to the C code (in caml_gc_regs), and then
restore their values from the stack. In between, the C code has updated
the values on the stack, so the restored values are not the saved ones.
-- Alain
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Caml-list] Compiler bug?
2012-08-06 11:32 ` Alain Frisch
@ 2012-08-06 12:16 ` Dmitry Bely
2012-08-07 1:35 ` Cedric Cellier
2012-08-08 16:03 ` Dmitry Bely
2 siblings, 0 replies; 30+ messages in thread
From: Dmitry Bely @ 2012-08-06 12:16 UTC (permalink / raw)
To: caml-list
On Mon, Aug 6, 2012 at 3:32 PM, Alain Frisch <alain@frisch.fr> wrote:
> On 08/06/2012 01:03 PM, Dmitry Bely wrote:
>>
>> _caml_call_gc:
>> ; Record lowest stack address and return address
>> mov eax, [esp]
>> mov _caml_last_return_address, eax
>> lea eax, [esp+4]
>> mov _caml_bottom_of_stack, eax
>> ; Save all regs used by the code generator
>> L105: push ebp
>> push edi
>> push esi
>> push edx
>> push ecx
>> push ebx
>> push eax
>> mov _caml_gc_regs, esp
>> ; Call the garbage collector
>> call _caml_garbage_collection
>> ; Restore all regs used by the code generator
>> pop eax
>> pop ebx
>> pop ecx
>> pop edx
>> pop esi
>> pop edi
>> pop ebp
>> ; Return to caller
>> ret
>>
>> it preserves registers and so it cannot update them
>
>
> No, this code does not preserves registers. It stores them on the stack,
> pass a pointer to them to the C code (in caml_gc_regs), and then restore
> their values from the stack. In between, the C code has updated the values
> on the stack, so the restored values are not the saved ones.
OK, you are right. Now I think I understand better how it works. So
the problem is elsewhere (as expected). Thanks for the help!
- Dmitry Bely
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Caml-list] Compiler bug?
2012-08-06 11:32 ` Alain Frisch
2012-08-06 12:16 ` Dmitry Bely
@ 2012-08-07 1:35 ` Cedric Cellier
2012-08-08 16:03 ` Dmitry Bely
2 siblings, 0 replies; 30+ messages in thread
From: Cedric Cellier @ 2012-08-07 1:35 UTC (permalink / raw)
To: caml-list
----- Message d'origine -----
> > ; Save all regs used by the code generator
> No, this code does not preserves registers.
So you suggest this comment is a little misleading? :p
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Caml-list] Compiler bug?
2012-08-06 11:32 ` Alain Frisch
2012-08-06 12:16 ` Dmitry Bely
2012-08-07 1:35 ` Cedric Cellier
@ 2012-08-08 16:03 ` Dmitry Bely
2012-08-08 18:03 ` Alain Frisch
2 siblings, 1 reply; 30+ messages in thread
From: Dmitry Bely @ 2012-08-08 16:03 UTC (permalink / raw)
To: caml-list
On Mon, Aug 6, 2012 at 3:32 PM, Alain Frisch <alain@frisch.fr> wrote:
> On 08/06/2012 01:03 PM, Dmitry Bely wrote:
>>
>> _caml_call_gc:
>> ; Record lowest stack address and return address
>> mov eax, [esp]
>> mov _caml_last_return_address, eax
>> lea eax, [esp+4]
>> mov _caml_bottom_of_stack, eax
>> ; Save all regs used by the code generator
>> L105: push ebp
>> push edi
>> push esi
>> push edx
>> push ecx
>> push ebx
>> push eax
>> mov _caml_gc_regs, esp
>> ; Call the garbage collector
>> call _caml_garbage_collection
>> ; Restore all regs used by the code generator
>> pop eax
>> pop ebx
>> pop ecx
>> pop edx
>> pop esi
>> pop edi
>> pop ebp
>> ; Return to caller
>> ret
>>
>> it preserves registers and so it cannot update them
>
>
> No, this code does not preserves registers. It stores them on the stack,
> pass a pointer to them to the C code (in caml_gc_regs), and then restore
> their values from the stack. In between, the C code has updated the values
> on the stack, so the restored values are not the saved ones.
One more question if you don't mind. Why FPU registers are not saved
before caml_garbage_collection? What guarantees that they are not
destroyed inside?
- Dmitry Bely
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Caml-list] Compiler bug?
2012-08-08 16:03 ` Dmitry Bely
@ 2012-08-08 18:03 ` Alain Frisch
2012-08-08 18:22 ` Jesper Louis Andersen
0 siblings, 1 reply; 30+ messages in thread
From: Alain Frisch @ 2012-08-08 18:03 UTC (permalink / raw)
To: Dmitry Bely; +Cc: caml-list
On 08/08/2012 06:03 PM, Dmitry Bely wrote:
> One more question if you don't mind. Why FPU registers are not saved
> before caml_garbage_collection? What guarantees that they are not
> destroyed inside?
I'm not sure, someone else would have to reply! I guess that these
registers are supposed to be preserved by the callee in the x64 ABI (and
obviously, they don't hold pointers to OCaml values, so they don't have
to be tracked by the GC).
-- Alain
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Caml-list] Compiler bug?
2012-08-08 18:03 ` Alain Frisch
@ 2012-08-08 18:22 ` Jesper Louis Andersen
2012-08-08 18:40 ` Dmitry Bely
` (2 more replies)
0 siblings, 3 replies; 30+ messages in thread
From: Jesper Louis Andersen @ 2012-08-08 18:22 UTC (permalink / raw)
To: Alain Frisch; +Cc: Dmitry Bely, caml-list
On Wed, Aug 8, 2012 at 8:03 PM, Alain Frisch <alain@frisch.fr> wrote:
> I'm not sure, someone else would have to reply! I guess that these
> registers are supposed to be preserved by the callee in the x64 ABI (and
> obviously, they don't hold pointers to OCaml values, so they don't have to
> be tracked by the GC).
Also, the GC itself will not be using Floating Point code at all, so
we can save a lot by not saving/restoring these values on the stack.
It is akin to what the Linux kernel does: trap on the first FP
instruction and only then do work to save/restore the FP context - but
here we know a priori we never will access FP. By doing so we can cut
the time to the GC call down - and reap the benefits by having a
faster GC cycle.
--
J.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Caml-list] Compiler bug?
2012-08-08 18:22 ` Jesper Louis Andersen
@ 2012-08-08 18:40 ` Dmitry Bely
2012-08-08 19:29 ` Fabrice Le Fessant
2012-08-08 23:34 ` Anil Madhavapeddy
2012-08-09 0:53 ` Francois Berenger
2 siblings, 1 reply; 30+ messages in thread
From: Dmitry Bely @ 2012-08-08 18:40 UTC (permalink / raw)
To: caml-list
On Wed, Aug 8, 2012 at 10:22 PM, Jesper Louis Andersen
<jesper.louis.andersen@gmail.com> wrote:
> On Wed, Aug 8, 2012 at 8:03 PM, Alain Frisch <alain@frisch.fr> wrote:
>
>> I'm not sure, someone else would have to reply! I guess that these
>> registers are supposed to be preserved by the callee in the x64 ABI (and
>> obviously, they don't hold pointers to OCaml values, so they don't have to
>> be tracked by the GC).
Yes, they should not be tracked by GC, but they may hold some FP
values (can they across GC call?). Actually for x64 FPU registers are
saved; see amd64.S/amd64nt.asm. But not for x86. I'm curious why.
> Also, the GC itself will not be using Floating Point code at all, so
> we can save a lot by not saving/restoring these values on the stack.
This is not entirely true; GC also triggers pending signals processing
that can modify FPU state and calls C library fuctions (e.g. malloc)
that can do anything inside.
> It is akin to what the Linux kernel does: trap on the first FP
> instruction and only then do work to save/restore the FP context - but
> here we know a priori we never will access FP. By doing so we can cut
> the time to the GC call down - and reap the benefits by having a
> faster GC cycle.
- Dmitry Bely
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Caml-list] Compiler bug?
2012-08-08 18:40 ` Dmitry Bely
@ 2012-08-08 19:29 ` Fabrice Le Fessant
0 siblings, 0 replies; 30+ messages in thread
From: Fabrice Le Fessant @ 2012-08-08 19:29 UTC (permalink / raw)
To: Dmitry Bely; +Cc: caml-list
On Wed, Aug 8, 2012 at 8:40 PM, Dmitry Bely <dmitry.bely@gmail.com> wrote:
>>> I'm not sure, someone else would have to reply! I guess that these
>>> registers are supposed to be preserved by the callee in the x64 ABI (and
>>> obviously, they don't hold pointers to OCaml values, so they don't have to
>>> be tracked by the GC).
>
> Yes, they should not be tracked by GC, but they may hold some FP
> values (can they across GC call?). Actually for x64 FPU registers are
> saved; see amd64.S/amd64nt.asm. But not for x86. I'm curious why.
MMX registers are not used by OCaml on x86 (except in the SSE2
branch), the backend still uses the x87 stack. I have no x86 computer
at hand to check, but I suppose that all temporary floating point
values are stored on the C stack by the generated code between
floating point computations, for example when calling C functions
(including the garbage collector).
>> Also, the GC itself will not be using Floating Point code at all, so
>> we can save a lot by not saving/restoring these values on the stack.
>
> This is not entirely true; GC also triggers pending signals processing
> that can modify FPU state and calls C library fuctions (e.g. malloc)
> that can do anything inside.
The kernel is supposed to save all the context before calling signal
handlers, otherwise, you would have to disable signals handling during
all floating point computations, no ?
--Fabrice
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Caml-list] Compiler bug?
2012-08-08 18:22 ` Jesper Louis Andersen
2012-08-08 18:40 ` Dmitry Bely
@ 2012-08-08 23:34 ` Anil Madhavapeddy
2012-08-09 0:53 ` Francois Berenger
2 siblings, 0 replies; 30+ messages in thread
From: Anil Madhavapeddy @ 2012-08-08 23:34 UTC (permalink / raw)
To: Jesper Louis Andersen
Cc: Alain Frisch, Dmitry Bely, caml-list@inria.fr List, PALI Gabor Janos
On 8 Aug 2012, at 11:22, Jesper Louis Andersen <jesper.louis.andersen@gmail.com> wrote:
> On Wed, Aug 8, 2012 at 8:03 PM, Alain Frisch <alain@frisch.fr> wrote:
>
>> I'm not sure, someone else would have to reply! I guess that these
>> registers are supposed to be preserved by the callee in the x64 ABI (and
>> obviously, they don't hold pointers to OCaml values, so they don't have to
>> be tracked by the GC).
>
> Also, the GC itself will not be using Floating Point code at all, so
> we can save a lot by not saving/restoring these values on the stack.
> It is akin to what the Linux kernel does: trap on the first FP
> instruction and only then do work to save/restore the FP context - but
> here we know a priori we never will access FP. By doing so we can cut
> the time to the GC call down - and reap the benefits by having a
> faster GC cycle.
The GC does use floating point code in places. See byterun/major_gc.c
and the caml_major_collection_slice() function. Gabor Pali is hacking on
a FreeBSD kernel module port of the OCaml runtime [1] and disabled all the
FP use to run without tripping over the trap checks in kFreeBSD [2].
[1] https://github.com/pgj/mirage-kfreebsd
[2] https://github.com/pgj/mirage-kfreebsd/commit/4c1859b88d6da540e7246e493809ff6f38ea344e
-anil
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Caml-list] Compiler bug?
2012-08-08 18:22 ` Jesper Louis Andersen
2012-08-08 18:40 ` Dmitry Bely
2012-08-08 23:34 ` Anil Madhavapeddy
@ 2012-08-09 0:53 ` Francois Berenger
2 siblings, 0 replies; 30+ messages in thread
From: Francois Berenger @ 2012-08-09 0:53 UTC (permalink / raw)
To: caml-list
On 08/09/2012 03:22 AM, Jesper Louis Andersen wrote:
> On Wed, Aug 8, 2012 at 8:03 PM, Alain Frisch <alain@frisch.fr> wrote:
>
>> I'm not sure, someone else would have to reply! I guess that these
>> registers are supposed to be preserved by the callee in the x64 ABI (and
>> obviously, they don't hold pointers to OCaml values, so they don't have to
>> be tracked by the GC).
>
> Also, the GC itself will not be using Floating Point code at all,
Is it possible to enforce that no Floating Point code
is used in some part of a project (let's say a module or a library)?
I have no use for it, I'm just curious.
> so
> we can save a lot by not saving/restoring these values on the stack.
> It is akin to what the Linux kernel does: trap on the first FP
> instruction and only then do work to save/restore the FP context - but
> here we know a priori we never will access FP. By doing so we can cut
> the time to the GC call down - and reap the benefits by having a
> faster GC cycle.
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* compiler bug?
@ 2006-05-17 23:14 Dan Koppel
2006-05-17 23:33 ` [Caml-list] " John Carr
2006-05-18 17:15 ` Xavier Leroy
0 siblings, 2 replies; 30+ messages in thread
From: Dan Koppel @ 2006-05-17 23:14 UTC (permalink / raw)
To: caml-list
[-- Attachment #1: Type: text/plain, Size: 1470 bytes --]
Hello Caml Group,
I would like to report what I think might be a bug in the Ocaml compiler. But first I wanted to run this by this group in case there's something I'm missing. I have some very simple code that consists of 2 nested loops. Inside the inner loop, is a simple statement. Furthermore, the inner loop is not "tight". Ie. the number of iterations within the inner loop is very large and the number of iterations of the outer loop is very small. I then manually time this. I then change the code by inserting a simple function call between the inner and outer loops. This should have virtually no effect whatsoever. However, when I time this, I get exactly twice the time. This is somewhat inexplicable. I tried tinkering with the "-inline" option for ocamlopt but this had no effect. Below is the actual code (main.ml):
let main () =
let dummy1 = ref 0 in
let dummy2 = ref 0.0 in
for i = 1 to 4 do
for j = 1 to 1000000000 do
dummy1 := !dummy1 + 1;
dummy1 := !dummy1 - 1
done;
dummy2 := Unix.gettimeofday ()
done
let _ = main ()
I compile as follows: ocamlopt unix.cmxa main.ml
and run: ./a.out
Is this in fact a bug of the ocamlopt compiler? Or is there some way currently to make this effect disappear?
Thanks for any help!
Sincerely,
Dan Koppel
---------------------------------
Sneak preview the all-new Yahoo.com. It's not radically different. Just radically better.
[-- Attachment #2: Type: text/html, Size: 1907 bytes --]
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Caml-list] compiler bug?
2006-05-17 23:14 compiler bug? Dan Koppel
@ 2006-05-17 23:33 ` John Carr
2006-05-18 17:15 ` Xavier Leroy
1 sibling, 0 replies; 30+ messages in thread
From: John Carr @ 2006-05-17 23:33 UTC (permalink / raw)
To: Dan Koppel; +Cc: caml-list
On SPARC the presence of the function call in the outer loop causes
the code generated for the inner loop to change so the dummy1 variable
is stored on the stack instead of in a register. Each loop iteration
loads dummy1, modifies it, and stores it back onto the stack.
The store-load hazard, loading a value that is in the store buffer,
adds a large delay. The loop runs in half the time if I comment out
either the store or the load in the assembly.
If the inner loop did more computation the effect would be much less.
This is surprising but not strictly a bug. Xavier Leroy has posted
about similar minor changes causing the compiler to box or unbox a
floating point value with major changes in performance.
> I would like to report what I think might be a bug in the Ocaml compiler. But first I wanted to run this by this group in case there's something I'm missing. I have some very simple code that consists of 2 nested loops. Inside the inner loop, is a simple statement. Furthermore, the inner loop is not "tight". Ie. the number of iterations within the inner loop is very large and the number of iterations of the outer loop is very small. I then manually time this. I then change the code by inserting a simple function call between the inner and outer loops. This should have virtually no effect whatsoever. However, when I time this, I get exactly twice the time. This is somewhat inexplicable. I tried tinkering with the "-inline" option for ocamlopt but this had no effect. Below is the actual code (main.ml):
>
> let main () =
> let dummy1 = ref 0 in
> let dummy2 = ref 0.0 in
> for i = 1 to 4 do
> for j = 1 to 1000000000 do
> dummy1 := !dummy1 + 1;
> dummy1 := !dummy1 - 1
> done;
> dummy2 := Unix.gettimeofday ()
> done
>
> let _ = main ()
>
> I compile as follows: ocamlopt unix.cmxa main.ml
> and run: ./a.out
>
> Is this in fact a bug of the ocamlopt compiler? Or is there some way currently to make this effect disappear?
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Caml-list] compiler bug?
2006-05-17 23:14 compiler bug? Dan Koppel
2006-05-17 23:33 ` [Caml-list] " John Carr
@ 2006-05-18 17:15 ` Xavier Leroy
2006-05-18 17:34 ` Jacques Carette
1 sibling, 1 reply; 30+ messages in thread
From: Xavier Leroy @ 2006-05-18 17:15 UTC (permalink / raw)
To: Dan Koppel; +Cc: caml-list
> I would like to report what I think might be a bug in the Ocaml
> compiler. But first I wanted to run this by this group in case there's
> something I'm missing. I have some very simple code that consists of 2
> nested loops. Inside the inner loop, is a simple statement.
> Furthermore, the inner loop is not "tight". Ie. the number of
> iterations within the inner loop is very large and the number of
> iterations of the outer loop is very small. I then manually time this.
> I then change the code by inserting a simple function call between the
> inner and outer loops. This should have virtually no effect
> whatsoever. However, when I time this, I get exactly twice the time.
> This is somewhat inexplicable.
As John Carr mentioned, the function call forces the variables that
are live across the call (i.e. dummy1) to be spilled, causing one additional
memory read and one additional write at each iteration of the inner
loop. Since your inner loop is approximately 4 integer instructions,
the additional memory traffic can easily halve performance.
I grant you that spill code generation could be more clever here and
spill/reload around the whole inner loop. But, no, it's not a bug:
Ocaml has a bug when the generated code crashes or produces incorrect
results.
> I learned in basic compiler theory that when a function is called, you
> save all the registers before entering the function.
That's an over-simplification. First, many compilers designate a
number of registers as "callee-save", meaning that all functions must
preserve their values. These would be preserved across a call.
It happens that ocamlopt does not implement callee-save registers, as
they make garbage collection (more exactly, stack traversal) and
exception handling more expensive. But even when a function call
destroys all registers, as with ocamlopt, there are many possible
placements for spills (saving a register in the stack) and reloads (of
a register from the stack), and the placement you suggest (only around
calls) is rarely optimal. Well, in your example it is :-)
Generation of spill code is a dark corner of compiler construction.
As with many other compilation problems, there are a bunch of
NP-completeness results. Unlike many other compilation problems, I'm
not aware of any published heuristics that works well in most cases.
Well, there is the paper by George and Appel where they use an ILP
solver (integer linear programming) to produce optimal spills, but
that's kind of expensive...
- Xavier Leroy
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Caml-list] compiler bug?
2006-05-18 17:15 ` Xavier Leroy
@ 2006-05-18 17:34 ` Jacques Carette
2006-05-18 17:46 ` Xavier Leroy
2006-05-18 18:19 ` skaller
0 siblings, 2 replies; 30+ messages in thread
From: Jacques Carette @ 2006-05-18 17:34 UTC (permalink / raw)
To: caml-list
Xavier Leroy wrote:
>Generation of spill code is a dark corner of compiler construction.
>As with many other compilation problems, there are a bunch of
>NP-completeness results. Unlike many other compilation problems, I'm
>not aware of any published heuristics that works well in most cases.
>Well, there is the paper by George and Appel where they use an ILP
>solver (integer linear programming) to produce optimal spills, but
>that's kind of expensive...
>
>
It is my impression that users of compilers are "ready" for the
following situation:
1) an optimizing compiler (like ocamlopt!) that produces good code
efficiently
2) a super-optimizing compiler that produces fantastic code, at whatever
cost.
Such a compiler would probably rapidly find a niche of fervent users.
Jacques
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Caml-list] compiler bug?
2006-05-18 17:34 ` Jacques Carette
@ 2006-05-18 17:46 ` Xavier Leroy
2006-05-18 19:31 ` Jacques Carette
2006-05-18 18:19 ` skaller
1 sibling, 1 reply; 30+ messages in thread
From: Xavier Leroy @ 2006-05-18 17:46 UTC (permalink / raw)
To: Jacques Carette; +Cc: caml-list
>> Well, there is the paper by George and Appel where they use an ILP
>> solver (integer linear programming) to produce optimal spills, but
>> that's kind of expensive...
>
> It is my impression that users of compilers are "ready" for the
> following situation:
> 1) an optimizing compiler (like ocamlopt!) that produces good code
> efficiently
> 2) a super-optimizing compiler that produces fantastic code, at whatever
> cost.
Clearly, you're not the guy who would have to support both compilers :-)
More seriously, I agree that optional "super-optimization" passes
could be useful in some cases.
Actually, George and Appel found that compilation times with their
approach were almost reasonable (e.g. a few minutes instead of a few
seconds for a standard compiler), but they had to use a commercial ILP
solver. If only there were *really good* ILP and SAT solvers under free
licenses...
- Xavier Leroy
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Caml-list] compiler bug?
2006-05-18 17:46 ` Xavier Leroy
@ 2006-05-18 19:31 ` Jacques Carette
2006-05-18 20:07 ` David Brown
0 siblings, 1 reply; 30+ messages in thread
From: Jacques Carette @ 2006-05-18 19:31 UTC (permalink / raw)
To: Xavier Leroy; +Cc: caml-list
Xavier Leroy wrote:
>Clearly, you're not the guy who would have to support both compilers :-)
>
>
Clearly :-). [Although I've done my time maintaining ancient software
used by ~2 million users, so you only get so much sympathy from me ;-) ]
>Actually, George and Appel found that compilation times with their
>approach were almost reasonable (e.g. a few minutes instead of a few
>seconds for a standard compiler), but they had to use a commercial ILP
>solver. If only there were *really good* ILP and SAT solvers under free
>licenses...
>
>
In Computer Algebra, people use Groebner bases all the time. They have
doubly-exponential worst-case complexity -- but seem to work rather well
in practice. So I have stopped paying attention to worst-case; average
case, when available, does matter a lot more.
For ILP, I have found
http://www-unix.mcs.anl.gov/otc/Guide/faq/linear-programming-faq.html#Q2
to be quite informative about current sources of "free" ILP solvers. Of
particular interest are:
GLPK: http://www.gnu.org/software/glpk/glpk.html
lp_solve: http://groups.yahoo.com/group/lp_solve/
For SAT, things are weirder. Of course there is
http://www.satlib.org/
as well as SVC
http://chicory.stanford.edu/SVC/
and CVC Lite
http://www.cs.nyu.edu/acsys/cvcl/
(these last 2 for SMT rather than pure SAT) which are under compatible
licenses. But at
http://www.qbflib.org/
and
http://www.satlive.org/
there are a number of additional candidates.
Of course, getting an agreement with SRI to use Yices
(http://fm.csl.sri.com/yices/) would go a long way towards satisfying
the "really good" requirement...
Jacques
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Caml-list] compiler bug?
2006-05-18 19:31 ` Jacques Carette
@ 2006-05-18 20:07 ` David Brown
2006-05-18 20:15 ` Jacques Carette
2006-05-18 20:20 ` Alain Frisch
0 siblings, 2 replies; 30+ messages in thread
From: David Brown @ 2006-05-18 20:07 UTC (permalink / raw)
To: Jacques Carette; +Cc: Xavier Leroy, caml-list
On Thu, May 18, 2006 at 03:31:26PM -0400, Jacques Carette wrote:
> In Computer Algebra, people use Groebner bases all the time. They have
> doubly-exponential worst-case complexity -- but seem to work rather well
> in practice. So I have stopped paying attention to worst-case; average
> case, when available, does matter a lot more.
Except when someone made a decision like this, in say a revision control
system, and suddenly you discover that you've provoked a worst case
scenario, and it suddenly takes hundreds of cpu hours to check in a file
rather than a few seconds.
For something like a compiler, worse case behavior is very important. It
is not generally acceptable for a build to just hang in the compiler.
Dave
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Caml-list] compiler bug?
2006-05-18 20:07 ` David Brown
@ 2006-05-18 20:15 ` Jacques Carette
2006-05-18 20:20 ` Alain Frisch
1 sibling, 0 replies; 30+ messages in thread
From: Jacques Carette @ 2006-05-18 20:15 UTC (permalink / raw)
To: David Brown; +Cc: caml-list
David Brown wrote:
>>In Computer Algebra, people use Groebner bases all the time. They have
>>doubly-exponential worst-case complexity -- but seem to work rather well
>>in practice. So I have stopped paying attention to worst-case; average
>>case, when available, does matter a lot more.
>>
>>
>
>Except when someone made a decision like this, in say a revision control
>system, and suddenly you discover that you've provoked a worst case
>scenario, and it suddenly takes hundreds of cpu hours to check in a file
>rather than a few seconds.
>
>
I see you've used ClearCase! Rather unpleasant experience.
>For something like a compiler, worse case behavior is very important. It
>is not generally acceptable for a build to just hang in the compiler.
>
Not the compiler, the super-optimizing pass in the compiler. I
completely agree that the base compiler should not hang a build, and
that complexity (including worst-case) there matters a lot.
Speaking of which, have you ever tried to compile an Ocaml program with
a LOT of functors in it? It requires quite a bit of patience...
Jacques
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Caml-list] compiler bug?
2006-05-18 20:07 ` David Brown
2006-05-18 20:15 ` Jacques Carette
@ 2006-05-18 20:20 ` Alain Frisch
1 sibling, 0 replies; 30+ messages in thread
From: Alain Frisch @ 2006-05-18 20:20 UTC (permalink / raw)
To: David Brown; +Cc: caml-list
David Brown wrote:
> For something like a compiler, worse case behavior is very important.
All the decent programming languages I'm using nowadays have a type
system whose worse case complexity is exponential in the size of the
source. No compiler can be better than exponential. Do you think I
should stop using these languages?
-- Alain
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Caml-list] compiler bug?
2006-05-18 17:34 ` Jacques Carette
2006-05-18 17:46 ` Xavier Leroy
@ 2006-05-18 18:19 ` skaller
2006-05-18 18:53 ` Jacques Carette
1 sibling, 1 reply; 30+ messages in thread
From: skaller @ 2006-05-18 18:19 UTC (permalink / raw)
To: Jacques Carette; +Cc: caml-list
On Thu, 2006-05-18 at 13:34 -0400, Jacques Carette wrote:
> It is my impression that users of compilers are "ready" for the
> following situation:
> 1) an optimizing compiler (like ocamlopt!) that produces good code
> efficiently
> 2) a super-optimizing compiler that produces fantastic code, at whatever
> cost.
>
> Such a compiler would probably rapidly find a niche of fervent users.
What about high level optimisations?
Felix supports this:
reduce revrev[t] (x:list[t]): rev (rev x) => x;
which, combined with inlining, removes adjacent list reversals.
This is a fairly trivial example of integrating logic with
programming as a way of achieving both correctness and
performance: the reduction above provides both semantic
knowledge to the reader as well as allowing the compiler
to generate better code.
--
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Caml-list] compiler bug?
2006-05-18 18:19 ` skaller
@ 2006-05-18 18:53 ` Jacques Carette
2006-05-19 1:47 ` skaller
0 siblings, 1 reply; 30+ messages in thread
From: Jacques Carette @ 2006-05-18 18:53 UTC (permalink / raw)
To: skaller; +Cc: caml-list
skaller wrote:
>What about high level optimisations?
>
>Felix supports this:
>
> reduce revrev[t] (x:list[t]): rev (rev x) => x;
>
>which, combined with inlining, removes adjacent list reversals.
>This is a fairly trivial example of integrating logic with
>programming as a way of achieving both correctness and
>performance: the reduction above provides both semantic
>knowledge to the reader as well as allowing the compiler
>to generate better code.
>
>
Haskell (GHC to be precise) allows that too. But is syntactic
term-rewriting, in other words it is *untyped*.
Ocaml is great because it is statically typed. MetaOCaml is wonderful
because it is statically typed. Adding an untyped layer to a typed
language seems like a definite step backwards (i.e. a hack)..
Jacques
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Caml-list] compiler bug?
2006-05-18 18:53 ` Jacques Carette
@ 2006-05-19 1:47 ` skaller
2006-05-19 2:17 ` Brian Hurt
2006-05-19 16:48 ` Jacques Carette
0 siblings, 2 replies; 30+ messages in thread
From: skaller @ 2006-05-19 1:47 UTC (permalink / raw)
To: Jacques Carette; +Cc: caml-list
On Thu, 2006-05-18 at 14:53 -0400, Jacques Carette wrote:
> skaller wrote:
>
> >What about high level optimisations?
> >
> >Felix supports this:
> >
> > reduce revrev[t] (x:list[t]): rev (rev x) => x;
> Haskell (GHC to be precise) allows that too. But is syntactic
> term-rewriting, in other words it is *untyped*.
It's well typed. x:list[t] means x is of type list[t].
--
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Caml-list] compiler bug?
2006-05-19 1:47 ` skaller
@ 2006-05-19 2:17 ` Brian Hurt
2006-05-19 3:11 ` skaller
2006-05-19 16:48 ` Jacques Carette
1 sibling, 1 reply; 30+ messages in thread
From: Brian Hurt @ 2006-05-19 2:17 UTC (permalink / raw)
To: skaller; +Cc: Jacques Carette, caml-list
On Fri, 19 May 2006, skaller wrote:
> On Thu, 2006-05-18 at 14:53 -0400, Jacques Carette wrote:
>> skaller wrote:
>>
>>> What about high level optimisations?
>>>
>>> Felix supports this:
>>>
>>> reduce revrev[t] (x:list[t]): rev (rev x) => x;
>
>> Haskell (GHC to be precise) allows that too. But is syntactic
>> term-rewriting, in other words it is *untyped*.
>
> It's well typed. x:list[t] means x is of type list[t].
I'm just wondering how often such optimizations are worthwhile. From what
I've read, basic register allocation and peephole optimizations gain you
2x the performance- after that, it's a fight for percentage points.
Especially considering that humans tend to be a lot better at recognizing
opportunities for high-level optimizations than computers are. For
example, a human would be much more likely to notice a double list
reversal when the two reversals are in seperate modules, or otherwise
obscured from the compiler. Even more so, the programmer may decide that
maybe a list isn't the right datastructure for this data- a decision I
wouldn't trust to a compiler.
Brian
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Caml-list] compiler bug?
2006-05-19 2:17 ` Brian Hurt
@ 2006-05-19 3:11 ` skaller
0 siblings, 0 replies; 30+ messages in thread
From: skaller @ 2006-05-19 3:11 UTC (permalink / raw)
To: Brian Hurt; +Cc: caml-list, Jacques Carette
On Thu, 2006-05-18 at 22:17 -0400, Brian Hurt wrote:
> >>> reduce revrev[t] (x:list[t]): rev (rev x) => x;
> I'm just wondering how often such optimizations are worthwhile.
Me too. I don't know. However the above type of optimisation
is only a hint at what can be done. It's what I could implement
easily in a few hours, after noting in generated code there
really were cases of double list reversals.
> From what
> I've read, basic register allocation and peephole optimizations gain you
> 2x the performance- after that, it's a fight for percentage points.
Most people seem to think eliminating array bounds checks is
worthwhile.
Haskell certainly thinks deforestation and closure
elimination to be useful. These are high level optimisations
which I expect to have much more impact in most cases
than register fiddling. (They'd better .. GHC is still
a snail solving some very simple problems)
Indeed, the current Felix optimiser was obtained by noting
the woeful code generated for things like the Shootout
multiple loop test, and adding enough heuristics to
eliminate closures so the code reduced to the same
kind of basic C code a C programmer would write.
Of course I hope gcc -O3 etc will do the low level optimisations
for me, but it has to start with decent C code for that
to work.
> Especially considering that humans tend to be a lot better at recognizing
> opportunities for high-level optimizations than computers are. For
> example, a human would be much more likely to notice a double list
> reversal when the two reversals are in seperate modules, or otherwise
> obscured from the compiler. Even more so, the programmer may decide that
> maybe a list isn't the right datastructure for this data- a decision I
> wouldn't trust to a compiler.
I agree. Also note applying even reductions of the form above is
extremely expensive in compile time (matching every subexpression
looking for double list reversals cannot be fast :)
However, the point is that there is scope for much improvement
with high level optimisations.
In particular, one hopes they can be hinted at in a way that
not only makes faster code .. but does so without compromising
correctness or other safety features.
--
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Caml-list] compiler bug?
2006-05-19 1:47 ` skaller
2006-05-19 2:17 ` Brian Hurt
@ 2006-05-19 16:48 ` Jacques Carette
2006-05-19 19:10 ` skaller
1 sibling, 1 reply; 30+ messages in thread
From: Jacques Carette @ 2006-05-19 16:48 UTC (permalink / raw)
To: skaller; +Cc: caml-list
skaller wrote:
>>>What about high level optimisations?
>>>
>>>Felix supports this:
>>>
>>> reduce revrev[t] (x:list[t]): rev (rev x) => x;
>>>
>>>
>>Haskell (GHC to be precise) allows that too. But is syntactic
>>term-rewriting, in other words it is *untyped*.
>>
>>
>
>It's well typed. x:list[t] means x is of type list[t].
>
>
The *result* is well-typed. What is the 'type' of the rule (ie the
'function' reduce) ? reduce acts on the programming language syntax,
not on semantic values.
Jacques
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [Caml-list] compiler bug?
2006-05-19 16:48 ` Jacques Carette
@ 2006-05-19 19:10 ` skaller
0 siblings, 0 replies; 30+ messages in thread
From: skaller @ 2006-05-19 19:10 UTC (permalink / raw)
To: Jacques Carette; +Cc: caml-list
On Fri, 2006-05-19 at 12:48 -0400, Jacques Carette wrote:
> skaller wrote:
>
> >>>What about high level optimisations?
> >>>
> >>>Felix supports this:
> >>>
> >>> reduce revrev[t] (x:list[t]): rev (rev x) => x;
> >>>
> >>>
> >>Haskell (GHC to be precise) allows that too. But is syntactic
> >>term-rewriting, in other words it is *untyped*.
> >It's well typed. x:list[t] means x is of type list[t].
> >
> >
> The *result* is well-typed. What is the 'type' of the rule (ie the
> 'function' reduce) ?
> reduce acts on the programming language syntax,
> not on semantic values.
Yes, but so what?
Originally you said it was *untyped*.
As if it were like a macro which rewrites all terms
rev (rev x)
to
x
but this is not the case.
The thing is Felix ALSO has macros which are indeed untyped:
macro fun f(x) = x + x;
so f x is replaced by x + x everywhere the f is in scope.
So when you later said
"Ocaml is great because it is statically typed. MetaOCaml is wonderful
because it is statically typed. Adding an untyped layer to a typed
language seems like a definite step backwards (i.e. a hack).. "
I think you're missing the point: I'd agree with that
comment and also agree it applied to Felix macros,
because they are indeed manipulations of untyped terms.
Macro processing is done first, before typing.
But 'reduce' is done after typing, it manipulates
typed terms: it isn't adding an *untyped* layer
to the system, on the contrary the semantics implied
are actually *beyond* ordinary static typing. We have
learned:
rev ^ 2 = id (revrev)
and this property of 'rev' is a constraint on its semantics
(but of course not a complete specification).
What one might say is that the mechanism is somewhat
ad hoc rather than being systematic.
The problem here, if any, you yourself pointed out
in private email some time ago -- there's no assurance
the rules are actually reductions: one or more such
rules may well diverge.
--
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net
^ permalink raw reply [flat|nested] 30+ messages in thread
end of thread, other threads:[~2012-08-09 0:53 UTC | newest]
Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-08-06 10:04 [Caml-list] Compiler bug? Dmitry Bely
2012-08-06 10:11 ` Alain Frisch
2012-08-06 10:20 ` Dmitry Bely
2012-08-06 10:34 ` Alain Frisch
2012-08-06 11:03 ` Dmitry Bely
2012-08-06 11:32 ` Alain Frisch
2012-08-06 12:16 ` Dmitry Bely
2012-08-07 1:35 ` Cedric Cellier
2012-08-08 16:03 ` Dmitry Bely
2012-08-08 18:03 ` Alain Frisch
2012-08-08 18:22 ` Jesper Louis Andersen
2012-08-08 18:40 ` Dmitry Bely
2012-08-08 19:29 ` Fabrice Le Fessant
2012-08-08 23:34 ` Anil Madhavapeddy
2012-08-09 0:53 ` Francois Berenger
-- strict thread matches above, loose matches on Subject: below --
2006-05-17 23:14 compiler bug? Dan Koppel
2006-05-17 23:33 ` [Caml-list] " John Carr
2006-05-18 17:15 ` Xavier Leroy
2006-05-18 17:34 ` Jacques Carette
2006-05-18 17:46 ` Xavier Leroy
2006-05-18 19:31 ` Jacques Carette
2006-05-18 20:07 ` David Brown
2006-05-18 20:15 ` Jacques Carette
2006-05-18 20:20 ` Alain Frisch
2006-05-18 18:19 ` skaller
2006-05-18 18:53 ` Jacques Carette
2006-05-19 1:47 ` skaller
2006-05-19 2:17 ` Brian Hurt
2006-05-19 3:11 ` skaller
2006-05-19 16:48 ` Jacques Carette
2006-05-19 19:10 ` skaller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox