Hi,

Thanks for your suggestions. I tried to patch lablgtk2 with:

--- src/ml_gdkpixbuf.c.orig     2012-04-03 13:56:29.618264702 +0200
+++ src/ml_gdkpixbuf.c  2012-04-03 13:56:58.106263510 +0200
@@ -119,7 +119,7 @@
   value ret;
   if (pb == NULL) ml_raise_null_pointer();
   ret = alloc_custom (&ml_custom_GdkPixbuf, sizeof pb,
-                     100, 1000);
+                     0, 1);
   p = Data_custom_val (ret);
   *p = ref ? g_object_ref (pb) : pb;
   return ret;

--- src/ml_gobject.c.orig       2012-04-03 15:40:11.002004506 +0200
+++ src/ml_gobject.c    2012-04-03 15:41:04.938002250 +0200
@@ -219,7 +219,7 @@
 CAMLprim value ml_g_value_new(void)
 {
     value ret = alloc_custom(&ml_custom_GValue, sizeof(value)+sizeof(GValue),
-                             20, 1000);
+                             0, 1);
     /* create an MLPointer */
     Field(ret,1) = (value)2;
     ((GValue*)&Field(ret,2))->g_type = 0;
@@ -272,14 +272,14 @@
   custom_serialize_default, custom_deserialize_default };
 CAMLprim value Val_gboxed(GType t, gpointer p)
 {
-    value ret = alloc_custom(&ml_custom_gboxed, 2*sizeof(value), 10, 1000);
+    value ret = alloc_custom(&ml_custom_gboxed, 2*sizeof(value), 0, 1);
     Store_pointer(ret, g_boxed_copy (t,p));
     Field(ret,2) = (value)t;
     return ret;
 }
 CAMLprim value Val_gboxed_new(GType t, gpointer p)
 {
-    value ret = alloc_custom(&ml_custom_gboxed, 2*sizeof(value), 10, 1000);
+    value ret = alloc_custom(&ml_custom_gboxed, 2*sizeof(value), 0, 1);
     Store_pointer(ret, p);
     Field(ret,2) = (value)t;
     return ret;



At startup it uses:
top - 16:40:27 up 1 day,  7:01, 28 users,  load average: 0.47, 0.50, 0.35
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
Cpu(s):  4.8%us,  1.3%sy,  0.0%ni, 93.6%id,  0.2%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:   4004736k total,  3617960k used,   386776k free,   130704k buffers
Swap:  4070396k total,     9244k used,  4061152k free,  1730344k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND           
10275 hans      20   0  529m  77m  13m S   14  2.0   0:01.66 vc_client.nativ   

and 12 hours later:
top - 04:40:07 up 1 day, 19:01, 35 users,  load average: 0.00, 0.01, 0.05
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
Cpu(s): 20.2%us,  3.4%sy,  0.0%ni, 76.1%id,  0.1%wa,  0.0%hi,  0.2%si,  0.0%st
Mem:   4004736k total,  3828308k used,   176428k free,   143928k buffers
Swap:  4070396k total,    10708k used,  4059688k free,  1756524k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND           
10275 hans      20   0  534m  82m  13m S   17  2.1 110:11.19 vc_client.nativ   

Without the patch, at startup:
top - 22:05:38 up 1 day, 12:26, 34 users,  load average: 0.35, 0.16, 0.13
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
Cpu(s):  5.6%us,  1.5%sy,  0.0%ni, 92.6%id,  0.2%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:   4004736k total,  3868136k used,   136600k free,   140900k buffers
Swap:  4070396k total,     9680k used,  4060716k free,  1837500k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND           
25111 hans      20   0  453m  76m  13m S   14  2.0   0:13.68 vc_client_old.n

top - 10:05:19 up 2 days, 26 min, 35 users,  load average: 0.01, 0.04, 0.05
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
Cpu(s): 20.4%us,  3.2%sy,  0.0%ni, 75.8%id,  0.4%wa,  0.0%hi,  0.2%si,  0.0%st
Mem:   4004736k total,  3830596k used,   174140k free,   261692k buffers
Swap:  4070396k total,    13640k used,  4056756k free,  1640452k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND           
25111 hans      20   0  453m  76m  13m S   49  2.0 263:05.34 vc_client_old.n  

From this it seems that, with the patch, CPU usage still grows over time, but at a much lower rate (110 vs. 263 CPU minutes over roughly 12 hours). On the other hand, memory usage increases with the patch (RES grows from 77m to 82m, while the unpatched version stays at 76m). I also tried patching the wrappers.h file, but then memory consumption simply exploded.

So the patch helps, but not enough. Is there some way to prevent this kind of behavior entirely, i.e. no extra memory usage and no growing CPU usage?

I have attached some additional profiling output in case it is of interest. In short, it appears to be the GC that is consuming the CPU.

Best,

Hans Ole

On Tue, Apr 3, 2012 at 2:13 PM, Jerome Vouillon <vouillon@pps.jussieu.fr> wrote:
On Tue, Apr 03, 2012 at 12:42:08PM +0200, Gerd Stolpmann wrote:
> This reminds me of a problem I had with a specific C binding (for mysql),
> years ago. That binding allocated custom blocks with badly chosen
> parameters used/max (see the docs for caml_alloc_custom in
> http://caml.inria.fr/pub/docs/manual-ocaml/manual032.html#toc144). If the
> ratio used/max is > 0, these parameters accelerate the GC. If the custom
> blocks are frequently allocated, this can have a dramatic effect, even for
> quite small used/max ratios. The solution was to change the code, and to
> set used=0 and max=1.
>
> This type of problem would match your observation that the GC works more
> and more the longer the program runs, i.e. the more custom blocks have
> been allocated.
>
> The problem basically also exists with bigarrays - with
> used=<size_of_bigarray> and max=256M (hardcoded).

I have also observed this with input-output channels (in_channel and
out_channel), where used = 1 and max = 1000. A full major GC is
performed every time a thousand files are opened, which can result in
a significant overhead when you open a lot of files and the heap is
large.

-- Jerome