* compiler bug? @ 2006-05-17 23:14 Dan Koppel 2006-05-17 23:33 ` [Caml-list] " John Carr 2006-05-18 17:15 ` Xavier Leroy 0 siblings, 2 replies; 16+ messages in thread From: Dan Koppel @ 2006-05-17 23:14 UTC (permalink / raw) To: caml-list [-- Attachment #1: Type: text/plain, Size: 1470 bytes --] Hello Caml Group, I would like to report what I think might be a bug in the Ocaml compiler. But first I wanted to run this by this group in case there's something I'm missing. I have some very simple code that consists of 2 nested loops. Inside the inner loop, is a simple statement. Furthermore, the inner loop is not "tight". Ie. the number of iterations within the inner loop is very large and the number of iterations of the outer loop is very small. I then manually time this. I then change the code by inserting a simple function call between the inner and outer loops. This should have virtually no effect whatsoever. However, when I time this, I get exactly twice the time. This is somewhat inexplicable. I tried tinkering with the "-inline" option for ocamlopt but this had no effect. Below is the actual code (main.ml): let main () = let dummy1 = ref 0 in let dummy2 = ref 0.0 in for i = 1 to 4 do for j = 1 to 1000000000 do dummy1 := !dummy1 + 1; dummy1 := !dummy1 - 1 done; dummy2 := Unix.gettimeofday () done let _ = main () I compile as follows: ocamlopt unix.cmxa main.ml and run: ./a.out Is this in fact a bug of the ocamlopt compiler? Or is there some way currently to make this effect disappear? Thanks for any help! Sincerely, Dan Koppel --------------------------------- Sneak preview the all-new Yahoo.com. It's not radically different. Just radically better. [-- Attachment #2: Type: text/html, Size: 1907 bytes --] ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Caml-list] compiler bug? 2006-05-17 23:14 compiler bug? Dan Koppel @ 2006-05-17 23:33 ` John Carr 2006-05-18 17:15 ` Xavier Leroy 1 sibling, 0 replies; 16+ messages in thread From: John Carr @ 2006-05-17 23:33 UTC (permalink / raw) To: Dan Koppel; +Cc: caml-list On SPARC the presence of the function call in the outer loop causes the code generated for the inner loop to change so the dummy1 variable is stored on the stack instead of in a register. Each loop iteration loads dummy1, modifies it, and stores it back onto the stack. The store-load hazard, loading a value that is in the store buffer, adds a large delay. The loop runs in half the time if I comment out either the store or the load in the assembly. If the inner loop did more computation the effect would be much less. This is surprising but not strictly a bug. Xavier Leroy has posted about similar minor changes causing the compiler to box or unbox a floating point value with major changes in performance. > I would like to report what I think might be a bug in the Ocaml compiler. But first I wanted to run this by this group in case there's something I'm missing. I have some very simple code that consists of 2 nested loops. Inside the inner loop, is a simple statement. Furthermore, the inner loop is not "tight". Ie. the number of iterations within the inner loop is very large and the number of iterations of the outer loop is very small. I then manually time this. I then change the code by inserting a simple function call between the inner and outer loops. This should have virtually no effect whatsoever. However, when I time this, I get exactly twice the time. This is somewhat inexplicable. I tried tinkering with the "-inline" option for ocamlopt but this had no effect. Below is the actual code (main.ml): > > let main () = > let dummy1 = ref 0 in > let dummy2 = ref 0.0 in > for i = 1 to 4 do > for j = 1 to 1000000000 do > dummy1 := !dummy1 + 1; > dummy1 := !dummy1 - 1 > done; > dummy2 := Unix.gettimeofday () > done > > let _ = main () > > I compile as follows: ocamlopt unix.cmxa main.ml > and run: ./a.out > > Is this in fact a bug of the ocamlopt compiler? Or is there some way currently to make this effect disappear? ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Caml-list] compiler bug? 2006-05-17 23:14 compiler bug? Dan Koppel 2006-05-17 23:33 ` [Caml-list] " John Carr @ 2006-05-18 17:15 ` Xavier Leroy 2006-05-18 17:34 ` Jacques Carette 1 sibling, 1 reply; 16+ messages in thread From: Xavier Leroy @ 2006-05-18 17:15 UTC (permalink / raw) To: Dan Koppel; +Cc: caml-list > I would like to report what I think might be a bug in the Ocaml > compiler. But first I wanted to run this by this group in case there's > something I'm missing. I have some very simple code that consists of 2 > nested loops. Inside the inner loop, is a simple statement. > Furthermore, the inner loop is not "tight". Ie. the number of > iterations within the inner loop is very large and the number of > iterations of the outer loop is very small. I then manually time this. > I then change the code by inserting a simple function call between the > inner and outer loops. This should have virtually no effect > whatsoever. However, when I time this, I get exactly twice the time. > This is somewhat inexplicable. As John Carr mentioned, the function call forces the variables that are live across the call (i.e. dummy1) to be spilled, causing one additional memory read and one additional write at each iteration of the inner loop. Since your inner loop is approximately 4 integer instructions, the additional memory traffic can easily halve performance. I grant you that spill code generation could be more clever here and spill/reload around the whole inner loop. But, no, it's not a bug: Ocaml has a bug when the generated code crashes or produces incorrect results. > I learned in basic compiler theory that when a function is called, you > save all the registers before entering the function. That's an over-simplification. First, many compilers designate a number of registers as "callee-save", meaning that all functions must preserve their values. These would be preserved across a call. It happens that ocamlopt does not implement callee-save registers, as they make garbage collection (more exactly, stack traversal) and exception handling more expensive. But even when a function call destroys all registers, as with ocamlopt, there are many possible placements for spills (saving a register in the stack) and reloads (of a register from the stack), and the placement you suggest (only around calls) is rarely optimal. Well, in your example it is :-) Generation of spill code is a dark corner of compiler construction. As with many other compilation problems, there are a bunch of NP-completeness results. Unlike many other compilation problems, I'm not aware of any published heuristics that works well in most cases. Well, there is the paper by George and Appel where they use an ILP solver (integer linear programming) to produce optimal spills, but that's kind of expensive... - Xavier Leroy ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Caml-list] compiler bug? 2006-05-18 17:15 ` Xavier Leroy @ 2006-05-18 17:34 ` Jacques Carette 2006-05-18 17:46 ` Xavier Leroy 2006-05-18 18:19 ` skaller 0 siblings, 2 replies; 16+ messages in thread From: Jacques Carette @ 2006-05-18 17:34 UTC (permalink / raw) To: caml-list Xavier Leroy wrote: >Generation of spill code is a dark corner of compiler construction. >As with many other compilation problems, there are a bunch of >NP-completeness results. Unlike many other compilation problems, I'm >not aware of any published heuristics that works well in most cases. >Well, there is the paper by George and Appel where they use an ILP >solver (integer linear programming) to produce optimal spills, but >that's kind of expensive... > > It is my impression that users of compilers are "ready" for the following situation: 1) an optimizing compiler (like ocamlopt!) that produces good code efficiently 2) a super-optimizing compiler that produces fantastic code, at whatever cost. Such a compiler would probably rapidly find a niche of fervent users. Jacques ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Caml-list] compiler bug? 2006-05-18 17:34 ` Jacques Carette @ 2006-05-18 17:46 ` Xavier Leroy 2006-05-18 19:31 ` Jacques Carette 2006-05-18 18:19 ` skaller 1 sibling, 1 reply; 16+ messages in thread From: Xavier Leroy @ 2006-05-18 17:46 UTC (permalink / raw) To: Jacques Carette; +Cc: caml-list >> Well, there is the paper by George and Appel where they use an ILP >> solver (integer linear programming) to produce optimal spills, but >> that's kind of expensive... > > It is my impression that users of compilers are "ready" for the > following situation: > 1) an optimizing compiler (like ocamlopt!) that produces good code > efficiently > 2) a super-optimizing compiler that produces fantastic code, at whatever > cost. Clearly, you're not the guy who would have to support both compilers :-) More seriously, I agree that optional "super-optimization" passes could be useful in some cases. Actually, George and Appel found that compilation times with their approach were almost reasonable (e.g. a few minutes instead of a few seconds for a standard compiler), but they had to use a commercial ILP solver. If only there were *really good* ILP and SAT solvers under free licenses... - Xavier Leroy ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Caml-list] compiler bug? 2006-05-18 17:46 ` Xavier Leroy @ 2006-05-18 19:31 ` Jacques Carette 2006-05-18 20:07 ` David Brown 0 siblings, 1 reply; 16+ messages in thread From: Jacques Carette @ 2006-05-18 19:31 UTC (permalink / raw) To: Xavier Leroy; +Cc: caml-list Xavier Leroy wrote: >Clearly, you're not the guy who would have to support both compilers :-) > > Clearly :-). [Although I've done my time maintaining ancient software used by ~2 million users, so you only get so much sympathy from me ;-) ] >Actually, George and Appel found that compilation times with their >approach were almost reasonable (e.g. a few minutes instead of a few >seconds for a standard compiler), but they had to use a commercial ILP >solver. If only there were *really good* ILP and SAT solvers under free >licenses... > > In Computer Algebra, people use Groebner bases all the time. They have doubly-exponential worst-case complexity -- but seem to work rather well in practice. So I have stopped paying attention to worst-case; average case, when available, does matter a lot more. For ILP, I have found http://www-unix.mcs.anl.gov/otc/Guide/faq/linear-programming-faq.html#Q2 to be quite informative about current sources of "free" ILP solvers. Of particular interest are: GLPK: http://www.gnu.org/software/glpk/glpk.html lp_solve: http://groups.yahoo.com/group/lp_solve/ For SAT, things are weirder. Of course there is http://www.satlib.org/ as well as SVC http://chicory.stanford.edu/SVC/ and CVC Lite http://www.cs.nyu.edu/acsys/cvcl/ (these last 2 for SMT rather than pure SAT) which are under compatible licenses. But at http://www.qbflib.org/ and http://www.satlive.org/ there are a number of additional candidates. Of course, getting an agreement with SRI to use Yices (http://fm.csl.sri.com/yices/) would go a long way towards satisfying the "really good" requirement... Jacques ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Caml-list] compiler bug? 2006-05-18 19:31 ` Jacques Carette @ 2006-05-18 20:07 ` David Brown 2006-05-18 20:15 ` Jacques Carette 2006-05-18 20:20 ` Alain Frisch 0 siblings, 2 replies; 16+ messages in thread From: David Brown @ 2006-05-18 20:07 UTC (permalink / raw) To: Jacques Carette; +Cc: Xavier Leroy, caml-list On Thu, May 18, 2006 at 03:31:26PM -0400, Jacques Carette wrote: > In Computer Algebra, people use Groebner bases all the time. They have > doubly-exponential worst-case complexity -- but seem to work rather well > in practice. So I have stopped paying attention to worst-case; average > case, when available, does matter a lot more. Except when someone made a decision like this, in say a revision control system, and suddenly you discover that you've provoked a worst case scenario, and it suddenly takes hundreds of cpu hours to check in a file rather than a few seconds. For something like a compiler, worse case behavior is very important. It is not generally acceptable for a build to just hang in the compiler. Dave ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Caml-list] compiler bug? 2006-05-18 20:07 ` David Brown @ 2006-05-18 20:15 ` Jacques Carette 2006-05-18 20:20 ` Alain Frisch 1 sibling, 0 replies; 16+ messages in thread From: Jacques Carette @ 2006-05-18 20:15 UTC (permalink / raw) To: David Brown; +Cc: caml-list David Brown wrote: >>In Computer Algebra, people use Groebner bases all the time. They have >>doubly-exponential worst-case complexity -- but seem to work rather well >>in practice. So I have stopped paying attention to worst-case; average >>case, when available, does matter a lot more. >> >> > >Except when someone made a decision like this, in say a revision control >system, and suddenly you discover that you've provoked a worst case >scenario, and it suddenly takes hundreds of cpu hours to check in a file >rather than a few seconds. > > I see you've used ClearCase! Rather unpleasant experience. >For something like a compiler, worse case behavior is very important. It >is not generally acceptable for a build to just hang in the compiler. > Not the compiler, the super-optimizing pass in the compiler. I completely agree that the base compiler should not hang a build, and that complexity (including worst-case) there matters a lot. Speaking of which, have you ever tried to compile an Ocaml program with a LOT of functors in it? It requires quite a bit of patience... Jacques ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Caml-list] compiler bug? 2006-05-18 20:07 ` David Brown 2006-05-18 20:15 ` Jacques Carette @ 2006-05-18 20:20 ` Alain Frisch 1 sibling, 0 replies; 16+ messages in thread From: Alain Frisch @ 2006-05-18 20:20 UTC (permalink / raw) To: David Brown; +Cc: caml-list David Brown wrote: > For something like a compiler, worse case behavior is very important. All the decent programming languages I'm using nowadays have a type system whose worse case complexity is exponential in the size of the source. No compiler can be better than exponential. Do you think I should stop using these languages? -- Alain ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Caml-list] compiler bug? 2006-05-18 17:34 ` Jacques Carette 2006-05-18 17:46 ` Xavier Leroy @ 2006-05-18 18:19 ` skaller 2006-05-18 18:53 ` Jacques Carette 1 sibling, 1 reply; 16+ messages in thread From: skaller @ 2006-05-18 18:19 UTC (permalink / raw) To: Jacques Carette; +Cc: caml-list On Thu, 2006-05-18 at 13:34 -0400, Jacques Carette wrote: > It is my impression that users of compilers are "ready" for the > following situation: > 1) an optimizing compiler (like ocamlopt!) that produces good code > efficiently > 2) a super-optimizing compiler that produces fantastic code, at whatever > cost. > > Such a compiler would probably rapidly find a niche of fervent users. What about high level optimisations? Felix supports this: reduce revrev[t] (x:list[t]): rev (rev x) => x; which, combined with inlining, removes adjacent list reversals. This is a fairly trivial example of integrating logic with programming as a way of achieving both correctness and performance: the reduction above provides both semantic knowledge to the reader as well as allowing the compiler to generate better code. -- John Skaller <skaller at users dot sf dot net> Felix, successor to C++: http://felix.sf.net ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Caml-list] compiler bug? 2006-05-18 18:19 ` skaller @ 2006-05-18 18:53 ` Jacques Carette 2006-05-19 1:47 ` skaller 0 siblings, 1 reply; 16+ messages in thread From: Jacques Carette @ 2006-05-18 18:53 UTC (permalink / raw) To: skaller; +Cc: caml-list skaller wrote: >What about high level optimisations? > >Felix supports this: > > reduce revrev[t] (x:list[t]): rev (rev x) => x; > >which, combined with inlining, removes adjacent list reversals. >This is a fairly trivial example of integrating logic with >programming as a way of achieving both correctness and >performance: the reduction above provides both semantic >knowledge to the reader as well as allowing the compiler >to generate better code. > > Haskell (GHC to be precise) allows that too. But is syntactic term-rewriting, in other words it is *untyped*. Ocaml is great because it is statically typed. MetaOCaml is wonderful because it is statically typed. Adding an untyped layer to a typed language seems like a definite step backwards (i.e. a hack).. Jacques ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Caml-list] compiler bug? 2006-05-18 18:53 ` Jacques Carette @ 2006-05-19 1:47 ` skaller 2006-05-19 2:17 ` Brian Hurt 2006-05-19 16:48 ` Jacques Carette 0 siblings, 2 replies; 16+ messages in thread From: skaller @ 2006-05-19 1:47 UTC (permalink / raw) To: Jacques Carette; +Cc: caml-list On Thu, 2006-05-18 at 14:53 -0400, Jacques Carette wrote: > skaller wrote: > > >What about high level optimisations? > > > >Felix supports this: > > > > reduce revrev[t] (x:list[t]): rev (rev x) => x; > Haskell (GHC to be precise) allows that too. But is syntactic > term-rewriting, in other words it is *untyped*. It's well typed. x:list[t] means x is of type list[t]. -- John Skaller <skaller at users dot sf dot net> Felix, successor to C++: http://felix.sf.net ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Caml-list] compiler bug? 2006-05-19 1:47 ` skaller @ 2006-05-19 2:17 ` Brian Hurt 2006-05-19 3:11 ` skaller 2006-05-19 16:48 ` Jacques Carette 1 sibling, 1 reply; 16+ messages in thread From: Brian Hurt @ 2006-05-19 2:17 UTC (permalink / raw) To: skaller; +Cc: Jacques Carette, caml-list On Fri, 19 May 2006, skaller wrote: > On Thu, 2006-05-18 at 14:53 -0400, Jacques Carette wrote: >> skaller wrote: >> >>> What about high level optimisations? >>> >>> Felix supports this: >>> >>> reduce revrev[t] (x:list[t]): rev (rev x) => x; > >> Haskell (GHC to be precise) allows that too. But is syntactic >> term-rewriting, in other words it is *untyped*. > > It's well typed. x:list[t] means x is of type list[t]. I'm just wondering how often such optimizations are worthwhile. From what I've read, basic register allocation and peephole optimizations gain you 2x the performance- after that, it's a fight for percentage points. Especially considering that humans tend to be a lot better at recognizing opportunities for high-level optimizations than computers are. For example, a human would be much more likely to notice a double list reversal when the two reversals are in seperate modules, or otherwise obscured from the compiler. Even more so, the programmer may decide that maybe a list isn't the right datastructure for this data- a decision I wouldn't trust to a compiler. Brian ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Caml-list] compiler bug? 2006-05-19 2:17 ` Brian Hurt @ 2006-05-19 3:11 ` skaller 0 siblings, 0 replies; 16+ messages in thread From: skaller @ 2006-05-19 3:11 UTC (permalink / raw) To: Brian Hurt; +Cc: caml-list, Jacques Carette On Thu, 2006-05-18 at 22:17 -0400, Brian Hurt wrote: > >>> reduce revrev[t] (x:list[t]): rev (rev x) => x; > I'm just wondering how often such optimizations are worthwhile. Me too. I don't know. However the above type of optimisation is only a hint at what can be done. It's what I could implement easily in a few hours, after noting in generated code there really were cases of double list reversals. > From what > I've read, basic register allocation and peephole optimizations gain you > 2x the performance- after that, it's a fight for percentage points. Most people seem to think eliminating array bounds checks is worthwhile. Haskell certainly thinks deforestation and closure elimination to be useful. These are high level optimisations which I expect to have much more impact in most cases than register fiddling. (They'd better .. GHC is still a snail solving some very simple problems) Indeed, the current Felix optimiser was obtained by noting the woeful code generated for things like the Shootout multiple loop test, and adding enough heuristics to eliminate closures so the code reduced to the same kind of basic C code a C programmer would write. Of course I hope gcc -O3 etc will do the low level optimisations for me, but it has to start with decent C code for that to work. > Especially considering that humans tend to be a lot better at recognizing > opportunities for high-level optimizations than computers are. For > example, a human would be much more likely to notice a double list > reversal when the two reversals are in seperate modules, or otherwise > obscured from the compiler. Even more so, the programmer may decide that > maybe a list isn't the right datastructure for this data- a decision I > wouldn't trust to a compiler. I agree. Also note applying even reductions of the form above is extremely expensive in compile time (matching every subexpression looking for double list reversals cannot be fast :) However, the point is that there is scope for much improvement with high level optimisations. In particular, one hopes they can be hinted at in a way that not only makes faster code .. but does so without compromising correctness or other safety features. -- John Skaller <skaller at users dot sf dot net> Felix, successor to C++: http://felix.sf.net ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Caml-list] compiler bug? 2006-05-19 1:47 ` skaller 2006-05-19 2:17 ` Brian Hurt @ 2006-05-19 16:48 ` Jacques Carette 2006-05-19 19:10 ` skaller 1 sibling, 1 reply; 16+ messages in thread From: Jacques Carette @ 2006-05-19 16:48 UTC (permalink / raw) To: skaller; +Cc: caml-list skaller wrote: >>>What about high level optimisations? >>> >>>Felix supports this: >>> >>> reduce revrev[t] (x:list[t]): rev (rev x) => x; >>> >>> >>Haskell (GHC to be precise) allows that too. But is syntactic >>term-rewriting, in other words it is *untyped*. >> >> > >It's well typed. x:list[t] means x is of type list[t]. > > The *result* is well-typed. What is the 'type' of the rule (ie the 'function' reduce) ? reduce acts on the programming language syntax, not on semantic values. Jacques ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Caml-list] compiler bug? 2006-05-19 16:48 ` Jacques Carette @ 2006-05-19 19:10 ` skaller 0 siblings, 0 replies; 16+ messages in thread From: skaller @ 2006-05-19 19:10 UTC (permalink / raw) To: Jacques Carette; +Cc: caml-list On Fri, 2006-05-19 at 12:48 -0400, Jacques Carette wrote: > skaller wrote: > > >>>What about high level optimisations? > >>> > >>>Felix supports this: > >>> > >>> reduce revrev[t] (x:list[t]): rev (rev x) => x; > >>> > >>> > >>Haskell (GHC to be precise) allows that too. But is syntactic > >>term-rewriting, in other words it is *untyped*. > >It's well typed. x:list[t] means x is of type list[t]. > > > > > The *result* is well-typed. What is the 'type' of the rule (ie the > 'function' reduce) ? > reduce acts on the programming language syntax, > not on semantic values. Yes, but so what? Originally you said it was *untyped*. As if it were like a macro which rewrites all terms rev (rev x) to x but this is not the case. The thing is Felix ALSO has macros which are indeed untyped: macro fun f(x) = x + x; so f x is replaced by x + x everywhere the f is in scope. So when you later said "Ocaml is great because it is statically typed. MetaOCaml is wonderful because it is statically typed. Adding an untyped layer to a typed language seems like a definite step backwards (i.e. a hack).. " I think you're missing the point: I'd agree with that comment and also agree it applied to Felix macros, because they are indeed manipulations of untyped terms. Macro processing is done first, before typing. But 'reduce' is done after typing, it manipulates typed terms: it isn't adding an *untyped* layer to the system, on the contrary the semantics implied are actually *beyond* ordinary static typing. We have learned: rev ^ 2 = id (revrev) and this property of 'rev' is a constraint on its semantics (but of course not a complete specification). What one might say is that the mechanism is somewhat ad hoc rather than being systematic. The problem here, if any, you yourself pointed out in private email some time ago -- there's no assurance the rules are actually reductions: one or more such rules may well diverge. -- John Skaller <skaller at users dot sf dot net> Felix, successor to C++: http://felix.sf.net ^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2006-05-19 19:10 UTC | newest] Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2006-05-17 23:14 compiler bug? Dan Koppel 2006-05-17 23:33 ` [Caml-list] " John Carr 2006-05-18 17:15 ` Xavier Leroy 2006-05-18 17:34 ` Jacques Carette 2006-05-18 17:46 ` Xavier Leroy 2006-05-18 19:31 ` Jacques Carette 2006-05-18 20:07 ` David Brown 2006-05-18 20:15 ` Jacques Carette 2006-05-18 20:20 ` Alain Frisch 2006-05-18 18:19 ` skaller 2006-05-18 18:53 ` Jacques Carette 2006-05-19 1:47 ` skaller 2006-05-19 2:17 ` Brian Hurt 2006-05-19 3:11 ` skaller 2006-05-19 16:48 ` Jacques Carette 2006-05-19 19:10 ` skaller
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox