Mailing list for all users of the OCaml language and system.
From: Warren Harris <warrensomebody@gmail.com>
To: Peter Hawkins <hawkinsp@cs.stanford.edu>
Cc: OCaml <caml-list@inria.fr>
Subject: Re: [Caml-list] gc overhead
Date: Tue, 2 Mar 2010 23:06:32 -0800
Message-ID: <613DC64E-A4C6-4C45-8A07-D59C1CAD5372@gmail.com>
In-Reply-To: <b0b348901003021655i436755e3obd2b8b55b320fa30@mail.gmail.com>

Peter,

Thanks, this is excellent info. I've been using both gprof and shark  
and understand the tradeoffs. I really was looking for a way to just  
provide a simple live "gc overhead" number that we could graph along  
with a bunch of other server health stats for our zenoss monitors.  
Looks like I'd need to hack my runtime a bit to get this though.

Warren
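
For reference, a minimal sketch of what the stock runtime does expose without any hacking: the standard library's Gc.quick_stat returns collection and allocation counters, and the difference between two samples could be graphed as a rough GC activity figure. This is not the "gc overhead" time percentage asked for above, since these counters record how often the GC ran and how much was allocated, not how long it took. The names gc_delta, allocated, delta and measure are hypothetical, introduced only for this illustration.

```ocaml
(* Sketch: GC activity between two samples, using Gc.quick_stat from the
   standard library. These are counts, not time spent in the GC. *)

(* Hypothetical record holding the difference between two samples. *)
type gc_delta = {
  minor_collections : int;   (* minor collections during the interval *)
  major_collections : int;   (* major cycles completed during the interval *)
  allocated_words : float;   (* words allocated during the interval *)
}

(* Total words allocated so far: promoted words are counted in both
   minor_words and major_words, so they are subtracted once. *)
let allocated (s : Gc.stat) : float =
  s.Gc.minor_words +. s.Gc.major_words -. s.Gc.promoted_words

let delta (before : Gc.stat) (after : Gc.stat) : gc_delta = {
  minor_collections =
    after.Gc.minor_collections - before.Gc.minor_collections;
  major_collections =
    after.Gc.major_collections - before.Gc.major_collections;
  allocated_words = allocated after -. allocated before;
}

(* Run [f] and report the GC counters it moved. A monitoring thread could
   instead sample Gc.quick_stat on a timer and publish the deltas. *)
let measure (f : unit -> 'a) : 'a * gc_delta =
  let before = Gc.quick_stat () in
  let result = f () in
  let after = Gc.quick_stat () in
  (result, delta before after)

let () =
  (* Hypothetical workload: build a list of boxed pairs. *)
  let work () =
    List.length (List.init 100_000 (fun i -> (i, float_of_int i))) in
  let (n, d) = measure work in
  Printf.printf "items=%d minor=%d major=%d words=%.0f\n"
    n d.minor_collections d.major_collections d.allocated_words
```

Gc.quick_stat is used rather than Gc.stat because it does not walk the heap, so it is cheap enough to call on a timer.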

On Mar 2, 2010, at 4:55 PM, Peter Hawkins wrote:

> Hi...
>
> On Tue, Mar 2, 2010 at 3:08 PM, Warren Harris <warrensomebody@gmail.com> wrote:
>>
>> Peter - gprof with ocaml works quite well:
>> http://caml.inria.fr/pub/docs/manual-ocaml/manual031.html
>
> I'm fully aware of gprof and ocaml's support of profiling.
>
> OCaml's profiling support works by adding calls to the _mcount library
> function at the entry point to every compiled function, which takes
> approximately 10 instructions on x86 (pushes and pops to save
> registers, and a call instruction). The _mcount function records
> function call counts, and is also responsible for producing the call
> graph. Separately, the profile library samples the program counter at
> some frequency, which lets us work out in which functions the program
> is spending its time.
>
> Using OCaml's profiling support has three problems:
> 1) programs compiled with profiling are slower,
> 2) the profiling instrumentation itself distorts the resulting profile, and
> 3) the call graph accounting is inaccurate.
>
> Let's discuss each of these in turn:
>
> Problem (1) is simply that your program has extra overhead from all of
> those _mcount calls, which occur on every function invocation. You
> can't turn them off, and you can't make them happen less frequently.
> It's an all-or-nothing proposition. It would be unusual to include
> profiling instrumentation in a production system.
>
> Problem (2) is a little more subtle. Recall that the profiling
> instrumentation adds ~10 instructions to the start of each function,
> regardless of its size. For a large function, this may be a negligible
> overhead. For a small function, say one that was only 5 or 10
> instructions in size to begin with, that is a substantial overhead.
> Since we determine how much time is spent in each function by sampling
> the program counter, small and frequently called functions will appear
> to take relatively longer than larger functions in the resulting
> profile. Small functions are common in OCaml code so we should see an
> appreciable amount of distortion.
>
> Problem (3) is a criticism of the _mcount mechanism in general. For
> each function f(), the profiler knows (a) how long we spent executing
> f() in total, and (b) how many times each of f()'s callers invoked
> f(). We do not know how much time f() spent executing on behalf of any
> given caller. If we assume that all of f()'s invocations took
> approximately the same amount of time, then we can use the caller
> counts to approximate the time spent executing f() on behalf of each
> caller. However, the assumption that f() always takes approximately
> the same amount of time is not necessarily a good one. I think it's an
> especially bad assumption in a functional program.
>
> These problems are avoided by using a sampling profiler like oprofile
> or shark, which samples an _uninstrumented_ binary at a particular
> frequency. Because the binary is unmodified, we can turn profiling on
> and off on a running system, avoiding point (1); furthermore we can
> adjust the sampling rate so profiling overhead is low enough to be
> tolerable. Since there is no instrumentation added to the program, the
> resulting profile does not suffer from the distortion of point (2).
> Some profilers (e.g. shark on Mac OS X) can deal with point (3) as
> well --- all we need to do is record a complete stack trace at
> sampling time.
>
> My point was that oprofile or one of its cousins (e.g. shark) is
> probably adequate for your needs. You can set the sampling rate low
> enough that your service can run more or less as normal. To determine
> GC overhead, you simply need to look at the total amount of time spent
> in the various GC functions of the runtime.
>
> Peter
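
The approximation described under problem (3) above can be made concrete with a little arithmetic. The following is a minimal sketch, not gprof's actual code: it splits the total time sampled in f() across callers in proportion to their call counts, which is the assumption the quoted text criticizes. The function name attribute, the callers g and h, and all the numbers are hypothetical.

```ocaml
(* Sketch of call-count-based attribution: each caller is charged
   total_time * (its calls to f / all calls to f). *)
let attribute (total_time : float) (calls : (string * int) list)
    : (string * float) list =
  let total_calls = List.fold_left (fun acc (_, n) -> acc + n) 0 calls in
  if total_calls = 0 then
    (* No recorded calls: nothing to attribute. *)
    List.map (fun (caller, _) -> (caller, 0.0)) calls
  else
    List.map
      (fun (caller, n) ->
         (caller,
          total_time *. float_of_int n /. float_of_int total_calls))
      calls

let () =
  (* Hypothetical data: 10 s sampled in f; g called f once, h nine times. *)
  let estimate = attribute 10.0 [ ("g", 1); ("h", 9) ] in
  List.iter
    (fun (caller, t) -> Printf.printf "%s: %.1f s\n" caller t)
    estimate
  (* prints "g: 1.0 s" then "h: 9.0 s" *)
```

If g's single call had in fact run for 9 s and each of h's calls for about 0.1 s, the estimate above would have the two callers exactly backwards. Recording a full stack trace with every sample avoids the problem, because each sample is then charged to the caller that was actually on the stack.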

