From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail1-relais-roc.national.inria.fr (mail1-relais-roc.national.inria.fr [192.134.164.82]) by yquem.inria.fr (Postfix) with ESMTP id A9A2ABC57 for ; Wed, 3 Mar 2010 01:55:25 +0100 (CET) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: An8CAKpCjUvRVdnYkGdsb2JhbACPMIN/h1UIFQEBAQEJCQwHEwMfsClChFgiiFUBAQMFglWCIQSDF4kD X-IronPort-AV: E=Sophos;i="4.49,570,1262559600"; d="scan'208";a="53911974" Received: from mail-gx0-f216.google.com ([209.85.217.216]) by mail1-smtp-roc.national.inria.fr with ESMTP; 03 Mar 2010 01:55:25 +0100 Received: by gxk8 with SMTP id 8so255842gxk.9 for ; Tue, 02 Mar 2010 16:55:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:in-reply-to :references:date:x-google-sender-auth:message-id:subject:from:to:cc :content-type; bh=63VUyFWnJgHbKCuK/qC7Xx7QsLEpr8NHwrU1wuJErEg=; b=AIa6+0bs5K0iu5zErN6MFhYUyjxuhY4bB784n98GeSzS08da6gVJcxEp2zgVvgSdtp +dXn9DCiepHWG3hILwJSChVPJwZArGwudEV8ZMVXBpyPHXZGvYIf8IwbXkamIAKAecwn l3TfiGmtscyX1zQ1M562ksfNLnUHyGNJaesiE= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; b=rRKyUbzVptnE+Q2ATe6+w1599fiWmob5cZC2P+hrZSm6QJS/iwQQDSyHXbwp89zPKs 42bp5S2gMRcu1d5zzYhLLgJrHYja+IA0AyGO7TckaxJtDJcq/FLR9Fgp+roW1mr+WaQy 32ErC04hEklim6mo9D8GTOUmScetfdYH0irjo= MIME-Version: 1.0 Sender: peter.hawkins@gmail.com Received: by 10.101.108.6 with SMTP id k6mr589277anm.53.1267577723657; Tue, 02 Mar 2010 16:55:23 -0800 (PST) In-Reply-To: <401ACAED-B7C3-4E3E-862E-045DEBC2A42B@gmail.com> References: <37150F07-8902-464A-9A0E-44A0C424C87B@gmail.com> <20100301085430.GA16106@annexia.org> <5043CED0-7CEC-4813-8325-76C3196E8FEE@gmail.com> <401ACAED-B7C3-4E3E-862E-045DEBC2A42B@gmail.com> Date: Tue, 2 Mar 2010 16:55:23 -0800 X-Google-Sender-Auth: 85283a802f5425aa Message-ID: Subject: Re: [Caml-list] gc overhead From: Peter Hawkins To: Warren Harris Cc: OCaml Content-Type: text/plain; charset=ISO-8859-1 X-Spam: no; 0.00; ocaml:01 ocaml's:01 ocaml's:01 invocation:01 ocaml:01 avoided:01 runtime:01 warren:98 callers:98 wrote:01 stack:01 graph:01 graph:01 caml-list:01 functions:01 Hi... On Tue, Mar 2, 2010 at 3:08 PM, Warren Harris wrote: > > Peter - gprof with ocaml works quite well: > http://caml.inria.fr/pub/docs/manual-ocaml/manual031.html I'm fully aware of gprof and ocaml's support of profiling. OCaml's profiling support works by adding calls to the _mcount library function at the entry point to every compiled function, which takes approximately 10 instructions on x86 (pushes and pops to save registers, and a call instruction). The _mcount function records function call counts, and is also responsible for producing the call graph. Separately, the profile library samples the program counter at some frequency, which lets us work out in which functions the program is spending its time. Using OCaml's profiling support has three problems: 1) programs compiled with profiling are slower, and 2) the profiling instrumentation itself distorts the resulting profile, and 3) the call graph accounting is inaccurate. Let's discuss each of these in turn: Problem (1) is simply that your program has extra overhead from all of those _mcount calls, which occur on every function invocation. You can't turn them off, and you can't make them happen less frequently. It's an all-or-nothing proposition. It would be unusual to include profiling instrumentation in a production system. Problem (2) is a little more subtle. Recall that the profiling instrumentation adds ~10 instructions to the start of each function, regardless of its size. For a large function, this may be a negligible overhead. For a small function, say one that was only 5 or 10 instructions in size to begin with, that is a substantial overhead. Since we determine how much time is spent in each function by sampling the program counter, small and frequently called functions will appear to take relatively longer than larger functions in the resulting profile. Small functions are common in OCaml code so we should see an appreciable amount of distortion. Problem (3) is a criticism of the _mcount mechanism in general. For each function f(), the profiler knows (a) how long we spent executing f() in total, and (b) how many times each of f()'s callers invoked f(). We do not know how much time f() spent executing on behalf of any given caller. If we assume that all of f()'s invocations took approximately the same amount of time, then we can use the caller counts to approximate the time spent executing f() on behalf of each caller. However, the assumption that f() always takes approximately the same amount of time is not necessarily a good one. I think it's an especially bad assumption in a functional program. These problems are avoided by using a sampling profiler like oprofile or shark, which samples an _uninstrumented_ binary at a particular frequency. Because the binary is unmodified, we can turn profiling on and off on a running system, avoiding point (1); furthermore we can adjust the sampling rate so profiling overhead is low enough to be tolerable. Since there is no instrumentation added to the program, the resulting profile does not suffer from the distortion of point (2). Some profilers (e.g. shark on Mac OS X) can deal with point (3) as well --- all we need to do is record a complete stack trace at sampling time. My point was that oprofile or one of its cousins (e.g. shark) is probably adequate for your needs. You can set the sampling rate low enough that your service can run more or less as normal. To determine GC overhead, you simply need to look at the total amount of time spent in the various GC functions of the runtime. Peter