* [Caml-list] native code optimization priorities
@ 2001-10-31  3:08 Chris Hecker
  2001-10-31  7:50 ` Fabrice Le Fessant
  2001-11-06 14:06 ` [Caml-list] native code optimization priorities Xavier Leroy
  0 siblings, 2 replies; 7+ messages in thread
From: Chris Hecker @ 2001-10-31  3:08 UTC
  To: caml-list

Hi, this is just a general question about the caml development team's
priorities with respect to the native code compiler's optimized code
generation (and bytecode where appropriate), and some specific
questions that go along with that.

I think optimizations are far less important than new features since
Moore's law works on the former but not the latter. So, in some sense,
I hope adding new features[*] is prioritized much higher than
optimization. However, I have a bunch of small things I'd like to
implement (or see implemented) for making native numerical code
faster. This is primarily for my video game work, but the kinds of
things I have in mind will also help any numerically intensive
application. So, here are my questions:

0. How important is optimization to the team?

1. Are there any new (big or small) optimizations planned or in the
   works?

2. What's the relative priority of new features versus compiler
   optimizations?

3. Is there some kind of standard suite of test applications the caml
   team runs to figure out whether an optimization is worth including?

4. Are numerical operations an important area for ocaml to succeed?
   Put another way, if an optimization helps numerical code but does
   not help other code (or even slightly hurts it), how would that
   patch be received? What about command line options for optimization
   (of which there are very few now) to offset this effect?

5. How does the team feel about optimizations added to the x86 code
   generator that don't help other platforms?

Thanks,
Chris

[*] My personal favorites one more time: overloading, module
    recursion, generics!
-------------------
Bug reports: http://caml.inria.fr/bin/caml-bugs  FAQ: http://caml.inria.fr/FAQ/
To unsubscribe, mail caml-list-request@inria.fr  Archives: http://caml.inria.fr
* Re: [Caml-list] native code optimization priorities
  2001-10-31  3:08 [Caml-list] native code optimization priorities Chris Hecker
@ 2001-10-31  7:50 ` Fabrice Le Fessant
  2001-11-06 14:20   ` [Caml-list] compiler patches in the CDK Xavier Leroy
  2001-11-06 14:06 ` [Caml-list] native code optimization priorities Xavier Leroy
  1 sibling, 1 reply; 7+ messages in thread
From: Fabrice Le Fessant @ 2001-10-31  7:50 UTC
  To: Chris Hecker; +Cc: caml-list

I'm not part of the OCaml development team, but as an "old" OCaml
user, I would reply:

> 0. How important is optimization to the team?
> 2. What's the relative priority of new features versus compiler
>    optimizations?

Optimizations are welcome, as long as they don't complicate the
compiler too much.

> 3. Is there some kind of standard suite of test applications the
>    caml team runs to figure out whether an optimization is worth
>    including?

Look at the CVS version of ocaml; there are test directories, I think.
Coq compilation is often used for evaluating optimizations.

> 4. Are numerical operations an important area for ocaml to
>    succeed? Put another way, if an optimization helps numerical
>    code but does not help other code (or even slightly hurts it),
>    how would that patch be received? What about command line
>    options for optimization (of which there are very few now) to
>    offset this effect?

Most current users seem more interested in "symbolic" computations
than in "numerical" applications. However, this might change if you
add such an optimization patch. But if your patch degrades "symbolic"
performance, you MUST ADD AN OPTION to trigger it ONLY on numerical
applications.

Note that, as discussed before on this mailing list, I would welcome
such a patch in the CDK.

> 5. How does the team feel about optimizations added to the x86
>    code generator that don't help other platforms?

x86 optimization is better than nothing.

Finally, I would say it might be interesting to have an optional pass
in the compiler, where user-contributed optimizations might be added.
Then there would be room for an independent project, something like
ocaml-opts.sourceforge.net, that would develop this pass.

Regards,

-- Fabrice
* Re: [Caml-list] compiler patches in the CDK
  2001-10-31  7:50 ` Fabrice Le Fessant
@ 2001-11-06 14:20   ` Xavier Leroy
  2001-11-06 13:49     ` Fabrice Le Fessant
  0 siblings, 1 reply; 7+ messages in thread
From: Xavier Leroy @ 2001-11-06 14:20 UTC
  To: Fabrice Le Fessant; +Cc: caml-list

> > 4. Are numerical operations an important area for ocaml to
> >    succeed? Put another way, if an optimization helps numerical
> >    code but does not help other code (or even slightly hurts it),
> >    how would that patch be received?
>
> Note that, as discussed before on this mailing list, I would welcome
> such a patch in the CDK.

This is one thing I'm not sure I understand about the CDK.

My initial view of the CDK is as a pre-packaged binary installation of
OCaml plus lots of user-contributed libraries and tools: a very
convenient thing indeed for users who want an OCaml development
environment that works and that is rich enough, without the hassle of
tracking down and installing all the bits themselves. Excellent idea.

But then we learn that the CDK also includes some experimental,
little-tested patches to the OCaml compilers, and that by doing this
Fabrice intends the CDK to serve also as a beta-test for these
experimental extensions and changes.

So, is the CDK a stable, convenient distribution for users who want
something that works with no hassle, or an experimental distribution
for users who want to sit on the bleeding edge and beta-test things?

Just curious.

- Xavier Leroy
* Re: [Caml-list] compiler patches in the CDK
  2001-11-06 14:20   ` [Caml-list] compiler patches in the CDK Xavier Leroy
@ 2001-11-06 13:49     ` Fabrice Le Fessant
  0 siblings, 0 replies; 7+ messages in thread
From: Fabrice Le Fessant @ 2001-11-06 13:49 UTC
  To: Xavier Leroy; +Cc: caml-list

Xavier wrote:
> This is one thing I'm not sure I understand about the CDK.
>
> My initial view of the CDK is as a pre-packaged binary installation of
> OCaml plus lots of user-contributed libraries and tools: a very
> convenient thing indeed for users who want an OCaml development
> environment that works and that is rich enough, without the hassle of
> tracking down and installing all the bits themselves. Excellent idea.
>
> But then we learn that the CDK also includes some experimental,
> little-tested patches to the OCaml compilers, and that by doing this
> Fabrice intends the CDK to serve also as a beta-test for these
> experimental extensions and changes.
>
> So, is the CDK a stable, convenient distribution for users who
> want something that works with no hassle, or an experimental
> distribution for users who want to sit on the bleeding edge and
> beta-test things?

I understand that the idea of untested patches being included in the
CDK can frighten users. Two replies:

1) Most patches included in the CDK until recently were very simple
   patches, which only modify small, well-delimited parts of the
   compiler. Bugs in these patches are very unlikely.

   However, it is true that I've added some experimental patches very
   recently, with the idea that the CDK should also welcome contributed
   patches to the compiler as it welcomes contributed libraries, some
   of these patches being often asked for on the caml mailing list.

   I've tried to read these patches carefully before including them,
   to reduce the risk of introducing bugs. In particular, most of them
   require the use of special keywords or options to trigger them, and
   so should not introduce bugs for users who don't use them.

2) As a result of your mail, and of this morning's discussion, I will
   remove all experimental patches from the compiler distributed in
   the CDK.

   However, since I think some of the experimental patches can still
   be useful for some users, I will investigate whether I can add a
   second compiler, something like ocamlc-patched and ocamlopt-patched,
   that will contain some of the patches and still be compatible with
   object files generated by ocamlc and ocamlopt.

Hope this answers your curiosity.

- Fabrice
* Re: [Caml-list] native code optimization priorities
  2001-10-31  3:08 [Caml-list] native code optimization priorities Chris Hecker
  2001-10-31  7:50 ` Fabrice Le Fessant
@ 2001-11-06 14:06 ` Xavier Leroy
       [not found]   ` <20011106154533.D27723@chopin.ai.univie.ac.at>
       [not found]   ` <Pine.SOL.4.20.0111061141330.10389-100000@godzilla.ics.uci.edu>
  1 sibling, 2 replies; 7+ messages in thread
From: Xavier Leroy @ 2001-11-06 14:06 UTC
  To: Chris Hecker; +Cc: caml-list

> However, I have a bunch of small things I'd like to implement (or
> see implemented) for making native numerical code faster. This is
> primarily for my video game work, but the kinds of things I have in
> mind will also help any numerically intensive application. So, here
> are my questions:
>
> 0. How important is optimization to the team?

Generating efficient machine code has always been an important aspect
of OCaml, and I spent quite a bit of work on this at the beginning of
the OCaml development (95-97). Nowadays, we are largely satisfied with
the performance of the generated code, and get very few requests for
improving it, so this aspect of the OCaml implementation has received
little attention recently.

Also, I believe we've hit the point of diminishing returns: the major
optimizations (those that lead to significant speedups on many
programs) are already in the ocamlopt compiler; further optimizations
would (I believe) result in tiny speedups (less than 5%) or be
extremely specific to a couple of test programs.

> 1. Are there any new (big or small) optimizations planned or in the works?

Not really. Like other members of the OCaml development team, I have
vague ideas about things that could be done, e.g. a Pentium-4 back-end
that would use SSE2 registers for floating-point, but this is all
low priority.

Of course, we are committed to tracking changes in dominant processor
architectures; for instance, if the IA64 becomes widespread (heavens
forbid), some effort will have to be invested in cross-basic-block
instruction scheduling, if-conversion, and perhaps exploitation of
advanced loads. But the fact is that computer architectures viewed
from the compiler writer's standpoint haven't changed significantly in
the last 5 years: these hardware guys do such a good job of cranking
out better and faster processors that require no change in the
compiler...

> 2. What's the relative priority of new features versus compiler
> optimizations?

As I said above, the demand for more optimizations is low. Moreover,
advanced compiler optimizations require a lot of implementation and
testing work.

> 3. Is there some kind of standard suite of test applications the
> caml team runs to figure out whether an optimization is worth
> including?

I make intensive use of the small benchmark suite available at:

  http://camlcvs.inria.fr/cgi-bin/cvsweb.cgi/ocaml/test/

These are mostly small benchmarks, but some of them (KB, fft, nucleic)
predict fairly well the performance of bigger applications. The ICFP
programming contest entries of the last three years have also been
used as benchmarks several times. Finally, the Coq theorem prover
stresses the compiler and runtime system quite well as far as symbolic
processing is concerned.

> 4. Are numerical operations an important area for ocaml to succeed?

Although ML is historically rooted in symbolic processing, I did quite
a bit of work on the compiler to achieve decent floating-point
performance. Still, symbolic processing is OCaml's bread-and-butter,
and takes precedence over floating-point performance.

> Put another way, if an optimization helps numerical code but does
> not help other code (or even slightly hurts it), how would that patch
> be received?

Does not help: OK. Slightly hurts it: that might be a problem.

OCaml contains one instance of this: float arrays are special-cased in
a way that tremendously improves the performance of floating-point
code, but slows down polymorphic code operating on arrays. I still
think this was an acceptable trade-off, but not everyone agrees.

Some of my earlier work on type-directed compilation (the Gallium
experimental compiler) was abandoned because, while it improved the
performance of floating-point and integer computations, it slowed down
the garbage collector too much, causing pure symbolic processing to
take an unacceptable performance hit.

> What about command line options for optimization (of which there
> are very few now) to offset this effect?

Only if we absolutely must. The problem with having lots of compiler
flags is that it makes testing the compiler much harder -- in
principle, all combinations of flags should be tested...

> 5. How does the team feel about optimizations added to the x86 code
> generator that don't help other platforms?

Fine with me. Like all compiler writers, I hate the IA32 architecture,
but that's what everyone uses these days. The ocamlopt back-end
already contains quite a bit of IA32-specific code (in the instruction
selection phase, for instance).

Hope this answers your questions.

- Xavier Leroy
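The float array special case described in this message can be observed from OCaml itself. What follows is a minimal sketch, assuming a compiler built with the default flat float array representation; the helper name `is_flat_float_array` is made up for illustration. It inspects the runtime tag of an array through the `Obj` module: a non-empty float array carries `Obj.double_array_tag` and stores its elements unboxed, while other arrays are ordinary blocks. Polymorphic array code cannot tell statically which representation it is given, so it has to check at run time, which is the cost to polymorphic code that the message refers to.

```ocaml
(* Sketch only: [is_flat_float_array] is a hypothetical helper, and the
   results assume the default (flat) float array representation. *)
let is_flat_float_array (a : 'a array) : bool =
  (* Float arrays are blocks tagged Double_array_tag holding raw doubles. *)
  Obj.tag (Obj.repr a) = Obj.double_array_tag

let floats : float array = [| 1.0; 2.0; 3.0 |]
let ints : int array = [| 1; 2; 3 |]

let () =
  Printf.printf "float array is flat: %b\n" (is_flat_float_array floats);
  Printf.printf "int array is flat:   %b\n" (is_flat_float_array ints)
```

`Obj` is used here purely for inspection; ordinary programs should not depend on it.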
[parent not found: <20011106154533.D27723@chopin.ai.univie.ac.at>]
* Re: [Caml-list] native code optimization priorities
       [not found]   ` <20011106154533.D27723@chopin.ai.univie.ac.at>
@ 2001-11-08  9:45     ` Xavier Leroy
  0 siblings, 0 replies; 7+ messages in thread
From: Xavier Leroy @ 2001-11-08  9:45 UTC
  To: Markus Mottl; +Cc: caml-list

> Just out of curiosity: what do you as a compiler developer dislike
> about the IA64-architecture so much that you said "heavens forbid"? Not
> that I have any opinion on this - it's only interesting to learn about
> shortcomings of the new architecture. Do you think its features are not
> useful for getting even more efficient code out of ocamlopt? Is it just
> too complicated a design?

This is getting off-topic for this list, but briefly: the IA64
architecture is baroque. It is very complex, provides lots of dubious
features (register windows, hardware support for software pipelining,
several kinds of load-store speculation), yet lacks some very basic
things (such as indirect addressing with immediate displacement). We
are very far from the elegance and minimality of classic RISCs such as
the Alpha. All these fancy features seem targeted at high-performance
Fortran; it is unclear how to exploit them for C, let alone for Caml.

Moreover, it relies on the compiler to make instruction parallelism
explicit. I believe this is a bad idea compared with what everyone
else is doing these days, i.e. discovering instruction parallelism at
run-time, in the chip (out-of-order execution).

Finally, the first silicon implementation (Itanium) is very late, very
expensive, and slower than a $150 Pentium or Athlon for integer code
(floating-point performance is excellent, though). Future
implementations will probably be better, but still this might indicate
something wrong in the design of the architecture.

As for using the IA64 features in ocamlopt-generated code, it might be
possible to make good use of predication (conditional instructions)
for short conditional sequences, and of load speculation (exploiting
the fact that a load from an immutable OCaml block cannot interfere
with any store). However, both features need new optimization passes
that include quite sophisticated heuristics (neither predication nor
load speculation is always a win; both can also swamp processor
resources with useless instructions).

- Xavier Leroy
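To make the two candidate optimizations from this message concrete, here is a small sketch of the kind of OCaml source they would apply to. The function names are invented for illustration, and nothing here claims that ocamlopt performs either transformation; the comments only mark where a back-end could in principle apply them.

```ocaml
(* A short conditional sequence: each arm just produces a value, so a
   back-end with predication could select the result without a branch. *)
let clamp_nonneg (x : int) : int =
  if x < 0 then 0 else x

(* [p] is an immutable tuple, so the store to [r] cannot change its
   fields. A compiler could therefore issue the load of [fst p] early,
   before the store, without altering the result. *)
let store_then_load (p : int * int) (r : int ref) : int =
  r := 0;
  fst p + !r

let () =
  let r = ref 41 in
  Printf.printf "%d %d\n" (clamp_nonneg (-5)) (store_then_load (7, 8) r)
```

The load of `!r`, by contrast, reads a mutable cell and must stay after the store, which is the distinction the message draws between immutable blocks and everything else.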
[parent not found: <Pine.SOL.4.20.0111061141330.10389-100000@godzilla.ics.uci.edu>]
* Re: [Caml-list] native code optimization priorities
       [not found]   ` <Pine.SOL.4.20.0111061141330.10389-100000@godzilla.ics.uci.edu>
@ 2001-11-08  9:59     ` Xavier Leroy
  0 siblings, 0 replies; 7+ messages in thread
From: Xavier Leroy @ 2001-11-08  9:59 UTC
  To: Niall Dalton; +Cc: caml-list

> > I have
> > vague ideas about things that could be done, e.g. a Pentium-4 back-end
> > that would use SSE2 registers for floating-point, but this is all
> > low priority.
>
> May I ask if you ever did implement this, would you limit it to some
> P4 specific technique? I've idly toyed with the idea of implementing
> something for Altivec on the G4.

I'm afraid I wasn't clear enough: the first step would be to use SSE2
registers as normal floating-point registers, storing only one float
per register, and performing single floating-point operations. This
would already improve float performance quite a lot compared with the
current x86 float stack. Other processors do not need this hack,
because they already have a sensible register-based float
architecture.

The next step, of course, would be to actually use SIMD instructions
to operate on pairs or quadruples of floats. The standard approach
would be to have special abstract types for these packed floats, with
operations corresponding to what the hardware SIMD unit provides. The
problem here is that of portability: SSE2 and Altivec, for instance,
do not provide the same SIMD instructions...

> I wondered if it would be possible
> to integrate this into the type inference; if the compiler can infer
> that certain values will never require more than a certain number of
> bits they become candidates for use in a SIMD unit. This is along the
> lines of Bitwidth Analysis (PLDI'00 Stephenson et al, and Larsen and
> Amarasinghe's Exploiting Superword Level Parallelism with Multimedia
> Instruction Sets, same conference). Scott Ananian's SM thesis at MIT
> also included a predicated (forward and reverse) SSA variant that used
> a similar optimization to find narrow operations that could be executed in
> parallel.

We're getting into really advanced stuff here! It's a research topic
on its own, and I somewhat doubt that we can extract much parallelism
this way, but we'll see.

- Xavier Leroy
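The "special abstract types for packed floats" approach from this message can be sketched as a module signature. Everything below is hypothetical: `PACKED2`, `Scalar_packed2`, and their operations are illustrative names, not an existing OCaml library, and the implementation is a plain scalar fallback rather than real SSE2 or Altivec code. The sketch only shows that client code sees an abstract type, so that an implementation backed by hardware operations could be substituted where the target provides them.

```ocaml
(* Hypothetical interface for a pair of packed floats. *)
module type PACKED2 = sig
  type t
  val make : float -> float -> t
  val add : t -> t -> t
  val mul : t -> t -> t
  val to_pair : t -> float * float
end

(* Portable scalar fallback: one record field per lane. A SIMD-backed
   implementation would keep the same signature. *)
module Scalar_packed2 : PACKED2 = struct
  type t = { lo : float; hi : float }
  let make lo hi = { lo; hi }
  let add a b = { lo = a.lo +. b.lo; hi = a.hi +. b.hi }
  let mul a b = { lo = a.lo *. b.lo; hi = a.hi *. b.hi }
  let to_pair v = (v.lo, v.hi)
end

let () =
  let open Scalar_packed2 in
  let a = make 1.0 2.0 and b = make 3.0 4.0 in
  (* Lane-wise a * b + b *)
  let lo, hi = to_pair (add (mul a b) b) in
  Printf.printf "%g %g\n" lo hi
```

The portability problem raised in the message shows up in the signature itself: any operation placed in `PACKED2` must either exist on every target SIMD unit or come with a software fallback like the one above.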