From: Philippe Wang
To: Guillaume Yziquel
Cc: ygrek, caml-list@inria.fr
Date: Mon, 13 Jun 2011 15:26:05 +0200
Subject: Re: [Caml-list] A question about GC.
In-Reply-To: <20110613121936.GA637@localhost>
References: <4DF55B5D.9090302@ropas.snu.ac.kr> <20110613143407.0fe166ce.ygrekheretix@gmail.com> <20110613121936.GA637@localhost>

On Mon, Jun 13, 2011 at 2:19 PM, Guillaume Yziquel wrote:
> On Monday 13 Jun 2011 at 14:34:07 (+0300), ygrek wrote:
>> On Mon, 13 Jun 2011 12:37:46 +0200
>> Philippe Wang wrote:
>>
>> > As far as I know, when OCaml's GC is not working as expected, it's
>> > (almost) always because there are blocks allocated outside OCaml's
>> > heap (a.k.a. custom blocks).
>>
>> Custom blocks are allocated on ocaml heap.
>
> No, not always. Nothing stops you from putting custom blocks outside of
> the ocaml heap. It might even sometimes make sense. Though usually
> you're fine allocating custom blocks on the ocaml heap with a pointer to
> out of heap data.
>
>> > How big are custom blocks? The bigger they are, the worse the behavior
>> > tends to be.
>>
>> Why?
>
> Because the GC relies on the two last arguments of caml_alloc_custom to
> properly evaluate the memory impact of allocating a custom block
> together with the data it refers to. It's easy to get it wrong for large
> data, and thus miscalibrate the GC on these blocks.
>
> However, the fact that grey values seem to accumulate doesn't seem to
> fit with such a simple explanation. But maybe Philippe is thinking
> about something else when talking about "big custom blocks allocated
> outside of the ocaml heap".

Let's consider this simple scenario:

  let tmp = new_very_large_data_outside_ocaml_heap () in
  let result = some_stuff(tmp) in
  result

=> If tmp is no longer needed once result has been computed, you will probably have to wait a long time before tmp is fully collected by the GC.
That's because finalize functions are not called very often (as far as I remember, so tell me if I'm wrong), and notably not during a minor collection, while it is the finalize functions that actually free the large data allocated outside OCaml's heap. So even if one or several minor collections are triggered, that memory is not reclaimed until a major collection.

This should explain why, in some scenarios, even if the C interface is written "perfectly", there is a kind of "memory leak" (not an actual leak, just delayed deallocation, which can still cause the program to fail).

The only way I know to deallocate custom data more quickly is to trigger the major collection more often. The only way to do that without calling Gc.something() is to write the program differently, so that it performs a lot of small allocations ("a lot" means "enough" here). There are two ways: allocate dummy data (not really good) or allocate useful data (better).

Allocating more data can happen when programming with threads! When some computation is done outside OCaml's heap, it's particularly relevant to use OCaml threads. But this means at least one thread must be running "normal OCaml" code.

N.B. Well, maybe I'm not addressing the problem at all. Actually, what I'm saying here is mostly based on my experience with mlgmp (an interface to use GMP in OCaml) about 5 years ago.

--
Philippe Wang
   mail@philippewang.info
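
The delayed deallocation described above can be observed with a small sketch. It makes several assumptions: Gc.finalise stands in for the C finalize function of a custom block, a plain Bytes buffer stands in for the out-of-heap data, and fake_external_alloc is a hypothetical name, not part of mlgmp or any library. It only illustrates that a finaliser is guaranteed to have run after a forced full major collection. How many finalisers have run after the minor collection depends on the runtime version.

```ocaml
(* Sketch only: [fake_external_alloc] is a hypothetical helper. A Bytes
   buffer plays the role of the large out-of-heap data, and the closure
   passed to Gc.finalise plays the role of the C finalize function. *)

let finalised = ref 0

(* Allocate a block and register a finaliser on it. The block becomes
   unreachable as soon as this function returns. *)
let fake_external_alloc () =
  let block = Bytes.create 4096 in
  Gc.finalise (fun _ -> incr finalised) block

let () =
  for _i = 1 to 10 do
    fake_external_alloc ()
  done;
  (* A minor collection alone is not expected to release these blocks. *)
  Gc.minor ();
  Printf.printf "finalised after minor collection: %d\n" !finalised;
  (* Forcing a full major collection lets the finalisers run. *)
  Gc.full_major ();
  Printf.printf "finalised after full major collection: %d\n" !finalised
```

Calling Gc.full_major explicitly is the blunt option. The alternative mentioned above, allocating enough ordinary OCaml data that major slices happen on their own, needs no explicit Gc call but gives less control over when the memory comes back.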