* [Caml-list] What is triggering a lot of GC work?
From: Francois Berenger @ 2013-02-25 2:08 UTC
To: caml-list

Hello,

Is there a way to profile a program in order to know which places in the source code trigger a lot of garbage collection work?

I've seen some profiling traces of OCaml programs of mine, sometimes the trace is very flat, and the obvious things are only GC-related.

I think it may mean some performance-critical part is written in a functional style and may benefit from some more imperative style.

Regards,
F.

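One rough way to narrow this down without any special tooling is to read the runtime's allocation counters around a suspect piece of code. The sketch below assumes only the standard Gc module (and List.init, so OCaml 4.06 or later); measure_alloc, sum_functional and sum_imperative are hypothetical names made up for the illustration. The counters say how many words were allocated, not how much time the GC spent on them.

```ocaml
(* Hypothetical helper: run [f] and return its result together with the
   number of words allocated meanwhile in the minor heap and in the major
   heap (the latter includes words promoted from the minor heap). *)
let measure_alloc f =
  let minor0, _, major0 = Gc.counters () in
  let result = f () in
  let minor1, _, major1 = Gc.counters () in
  (result, minor1 -. minor0, major1 -. major0)

(* Functional style: builds an intermediate list, one block per element. *)
let sum_functional () =
  List.fold_left (+) 0 (List.init 100_000 (fun i -> i))

(* Imperative style: a loop over an integer accumulator. *)
let sum_imperative () =
  let s = ref 0 in
  for i = 0 to 99_999 do s := !s + i done;
  !s

let () =
  let report label f =
    let _, minor, major = measure_alloc f in
    Printf.printf "%-12s minor=%.0f words, major=%.0f words\n" label minor major
  in
  report "functional" sum_functional;
  report "imperative" sum_imperative
```

Wrapping a few candidate functions this way gives a coarse picture of where the allocation happens; it does not replace a per-location profiler.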
* Re: [Caml-list] What is triggering a lot of GC work?
From: Mark Shinwell @ 2013-02-25 8:02 UTC
To: Francois Berenger; +Cc: caml-list

On 25 February 2013 02:08, Francois Berenger <berenger@riken.jp> wrote:
> Is there a way to profile a program in order
> to know which places in the source code
> trigger a lot of garbage collection work?

Well, as of last week, there is!

I'm working on a compiler and runtime patch which allows the identification, without excessive overhead, of every location (source file name / line number) which causes a minor or major heap allocation, together with the number of words allocated at that point.

There should be something available within the next couple of weeks. It only works on native code compiled for x86-64 machines at present. Currently it has only been tested on Linux, although I expect it to work on other Unix-like platforms with little or no modification.

Mark

* Re: [Caml-list] What is triggering a lot of GC work?
From: ygrek @ 2013-02-25 10:32 UTC
To: caml-list

On Mon, 25 Feb 2013 08:02:54 +0000 Mark Shinwell <mshinwell@janestreet.com> wrote:
> I'm working on a compiler and runtime patch which allows the
> identification, without excessive overhead, of every location (source
> file name / line number) which causes a minor or major heap allocation
> together with the number of words allocated at that point.
>
> There should be something available within the next couple of weeks.
> It only works on native code compiled for x86-64 machines at present.
> Currently it has only been tested on Linux---although I expect it to
> work on other Unix-like platforms with little or no modification.

Meanwhile you can use poor man's allocation profiler:
- http://ygrek.org.ua/p/code/pmpa
- https://sympa-roc.inria.fr/wws/arc/caml-list/2011-08/msg00050.html

-- 
ygrek
http://ygrek.org.ua

* Re: [Caml-list] What is triggering a lot of GC work?
From: Francois Berenger @ 2013-02-26 3:46 UTC
To: caml-list

On 02/25/2013 07:32 PM, ygrek wrote:
> Meanwhile you can use poor man's allocation profiler :
> - http://ygrek.org.ua/p/code/pmpa
> - https://sympa-roc.inria.fr/wws/arc/caml-list/2011-08/msg00050.html

Did the changes reported on mldonkey to reduce allocations have a significant impact on performance?

* Re: [Caml-list] What is triggering a lot of GC work?
From: ygrek @ 2013-02-26 4:29 UTC
To: Francois Berenger; +Cc: caml-list

On Tue, 26 Feb 2013 12:46:05 +0900 Francois Berenger <berenger@riken.jp> wrote:
> Did the changes reported on mldonkey to reduce allocations have a
> significant impact on performance?

Unfortunately, there was no feedback from the users of embedded versions of mldonkey, who have constrained memory and CPU resources, but no performance problems have been reported since then either, so it is hard to tell.

-- 
ygrek
http://ygrek.org.ua

* AW: [Caml-list] What is triggering a lot of GC work?
From: Gerd Stolpmann @ 2013-02-25 13:31 UTC
To: Francois Berenger; +Cc: caml-list

On 25.02.2013 03:08:14, Francois Berenger wrote:
> Is there a way to profile a program in order
> to know which places in the source code
> trigger a lot of garbage collection work?
>
> I've seen some profiling traces of OCaml programs
> of mine, sometimes the trace is very flat,
> and the obvious things are only GC-related.
>
> I think it may mean some performance-critical part
> is written in a functional style and may benefit
> from some more imperative style.

This is really a hard question, and I fear an allocation profiler cannot always answer it. Imperative style means using assignments, and assignments often have to go through caml_modify, so they are not as cheap as you would think. In contrast, allocating something new can usually avoid caml_modify.

This can have counter-intuitive consequences. Yesterday I sped an imperative program up by adding allocations! The idea is so strange that I need to report it here. The program uses an array for storing intermediate values. Originally, there was only one such array, and sooner or later this array was moved to the major heap by the GC. Assigning young values to the elements of an array in the major heap is the most expensive form of assignment: the array elements are temporarily registered as roots by the OCaml runtime. So my idea was to create a fresh copy of the array now and then, so that it is more often in the minor heap (the array was quite small). Assignments within the minor heap are cheaper, as there is no root registration. In the end the program was 10% faster.

My general experience is that optimizing the memory behavior is one of the most difficult tasks, especially because the OCaml runtime is designed for functional programming, and short-lived allocations are really cheap. Usual rules like "assignment is cheaper than new allocation" just do not hold. It depends.

Gerd

-- 
Gerd Stolpmann, Darmstadt, Germany    gerd@gerd-stolpmann.de
Creator of GODI and camlcity.org.
Contact details: http://www.camlcity.org/contact.html
Company homepage: http://www.gerd-stolpmann.de

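The effect Gerd describes could be reproduced along the following lines. This is only a sketch under assumptions, not his customer's code: the names run_long_lived, run_refreshed, refresh_every and time are made up, Gc.minor () is used merely to push the frame into the major heap early, and string_of_int stands in for "every data manipulation creates new strings". Whether the refreshed variant is actually faster depends on the program, the array size and the OCaml version, so it has to be measured.

```ocaml
(* Variant A: a single long-lived frame that ends up in the major heap. *)
let run_long_lived ~iters =
  let frame = Array.make 8 "" in
  Gc.minor ();  (* the frame is live, so this promotes it to the major heap *)
  for i = 1 to iters do
    (* A freshly allocated (young) string is stored into an old block:
       this is the expensive kind of assignment discussed above. *)
    frame.(0) <- string_of_int i
  done;
  frame.(0)

(* Variant B: replace the frame by a fresh copy now and then, so that it
   is young most of the time. The array is small, so the copy is expected
   to be allocated in the minor heap. *)
let run_refreshed ~iters ~refresh_every =
  let frame = ref (Array.make 8 "") in
  for i = 1 to iters do
    if i mod refresh_every = 0 then frame := Array.copy !frame;
    (!frame).(0) <- string_of_int i
  done;
  (!frame).(0)

(* Hypothetical timing helper based on CPU time. *)
let time label f =
  let t0 = Sys.time () in
  let r = f () in
  Printf.printf "%-11s %.3fs (last item: %s)\n" label (Sys.time () -. t0) r

let () =
  let iters = 2_000_000 in
  time "long-lived" (fun () -> run_long_lived ~iters);
  time "refreshed" (fun () -> run_refreshed ~iters ~refresh_every:1_000)
```

The value of refresh_every trades the cost of the extra copies against the time the frame spends in the major heap; 1_000 is an arbitrary choice for the illustration.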
* Re: AW: [Caml-list] What is triggering a lot of GC work?
From: Alain Frisch @ 2013-02-25 15:45 UTC
To: Gerd Stolpmann; +Cc: Francois Berenger, caml-list

On 02/25/2013 02:31 PM, Gerd Stolpmann wrote:
> This can have counter-intuitive consequences. Yesterday I sped an
> imperative program up by adding allocations!

This is a really interesting scenario, thanks for sharing!

Two other approaches to addressing the same performance issue could have been:

1. increase the size of the minor heap so that your array stays in it long enough;

2. try to reduce the number of other allocations.

Did you try either of these approaches as well? (Approach 1 in particular is easy to test.)

Gabriel Scherer recently called on the community to share representative "benchmarks", in order to help core developers target optimization efforts to where they are useful:

http://gallium.inria.fr/~scherer/gagallium/we-need-a-representative-benchmark-suite/

Gabriel: apart from LexiFi's contribution, did you get any code?

Gerd: it would be great if you could share the code you mention above; is that an option? A number of optimizations have been proposed (related to boxing of floats, compilation strategy for let-binding on tuples, etc.) which could significantly reduce the allocation rate of some programs. In my experience, this reduction can be observed on real-sized programs, but it does not translate to noticeable speedups. It might be the case that your program would benefit from such optimizations. Having access to the code would be very useful!

Alain

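Approach 1 can be tried without touching the program logic. A minimal sketch, assuming only the standard Gc module: set_minor_heap_words is a hypothetical helper name, and the size of 1M words is an arbitrary value chosen for illustration, not a recommendation.

```ocaml
(* Hypothetical helper: ask the runtime for a minor heap of [words] words.
   The runtime may adjust the value it actually uses. *)
let set_minor_heap_words words =
  Gc.set { (Gc.get ()) with Gc.minor_heap_size = words }

let () =
  let before = (Gc.get ()).Gc.minor_heap_size in
  set_minor_heap_words (1024 * 1024);  (* arbitrary size for illustration *)
  let after = (Gc.get ()).Gc.minor_heap_size in
  Printf.printf "minor heap: %d -> %d words\n" before after
```

The same setting should also be reachable without recompiling, through the s parameter of the OCAMLRUNPARAM environment variable (for example OCAMLRUNPARAM=s=1M), which makes it quick to compare a few sizes.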
* Re: AW: [Caml-list] What is triggering a lot of GC work?
From: Gerd Stolpmann @ 2013-02-25 16:26 UTC
To: Alain Frisch; +Cc: Francois Berenger, caml-list

On Monday, 25.02.2013, 16:45 +0100, Alain Frisch wrote:
> Two other approaches to addressing the same performance issue could have
> been:
>
> 1. increase the size of the minor heap so that your array stays in it
> long enough;
>
> 2. try to reduce the number of other allocations.
>
> Did you try one of these approaches as well? (1 in particular is
> particularly easy to test.)

No, there was no chance of keeping this array in the minor heap otherwise; the program was running for too long.

> Gabriel: except from LexiFi's contribution, did you get any code? Gerd:
> it would be great if you could share the code you mention above; is it
> an option?

Unfortunately not - it's an interpreter I developed for my customer. I can try to create a synthetic demo case just to show the effect.

(In this program the array is actually a kind of stack frame, and the program interprets some data manipulation code. When executing a statement, the current data item is put into the first cell of the frame, so we really have a lot of assignments here. The data items are strings, and every data manipulation creates new strings, which results in a certain allocation rate (but not a really high one, as in e.g. a term rewriter).)

Gerd

* Re: AW: [Caml-list] What is triggering a lot of GC work?
From: Gabriel Scherer @ 2013-02-25 16:32 UTC
To: Alain Frisch; +Cc: Gerd Stolpmann, Francois Berenger, caml-list

Thanks for the friendly poking. I did get some code (I've actually been surprised by how dedicated some submitters are, e.g. Edwin Török), but my plate has been full non-stop since, and I haven't yet taken the time to put this into shape. It's on my TODO list and I hope to share some results in the coming weeks.

Regarding the interesting battle story from Gerd, my own idea was to "oldify" the values before inserting them in the array, in order not to fire the write barrier. Oldifying values is costly as well, so I'm not sure whether that's worthwhile if the array is long-lived but the elements are short-lived. More importantly, the oldifying interface is, to my knowledge, not exposed to end-users (although it is possible through the C interface to allocate directly in the old region), so this cannot be written and tested without ugly hacks right now. I'd still be curious to know how this solution would compare to the others.

* [Caml-list] OCaml benchmarks
From: Török Edwin @ 2013-02-25 16:52 UTC
To: caml-list

On 02/25/2013 06:32 PM, Gabriel Scherer wrote:
> Thanks for the friendly poking. I did get some code (I've actually
> been surprised by how dedicated some submitters are, e.g. Edwin Török),
> but my plate has been full non-stop since and I haven't yet taken the
> time to put this into shape. It's on my TODO list and I hope to share
> some results in the coming weeks.

Thanks, it's not yet finished though: I meant to add a benchmark for ocaml-re too and then publish it. I got sidetracked trying to find some meaningful way to easily represent the results though (the text output is a bit too verbose).

But since you brought it up, I'd like your opinion on plots. Currently I'm thinking of generating from the .csv:
- one SVG boxplot for the (weighted) median/mean of OCaml version X vs Y performance
- one SVG paired barplot with confidence intervals for the individual benchmarks
- instead of X-axis labels, on-mouse-over tooltips (SVG title element) describing benchmark name and time statistics

Initially I tried boxplots for the individual measurements (using the PNG/PDF output of archimedes), but either the graphs looked too crowded (not enough room to place all labels), or there were too many graphs and it was hard to get an overall picture (if I put fewer benchmarks per page).

Best regards,
--Edwin
