From: Hans Ole Rafaelsen <hrafaelsen@gmail.com>
To: Jerome Vouillon <vouillon@pps.jussieu.fr>
Cc: Gerd Stolpmann <info@gerd-stolpmann.de>,
"Richard W.M. Jones" <rich@annexia.org>,
caml-list@inria.fr
Subject: Re: [Caml-list] Strategies for finding memory leaks
Date: Wed, 4 Apr 2012 12:53:21 +0200 [thread overview]
Message-ID: <CALs4vDYYAjv0RN0kFx8wPqNzFW4t8LZNfaqE0-B5pzS69xB02g@mail.gmail.com> (raw)
In-Reply-To: <20120403121327.GA29159@pps.jussieu.fr>
[-- Attachment #1.1: Type: text/plain, Size: 5561 bytes --]
Hi,
Thanks for your suggestions. I tried to patch lablgtk2 with:
--- src/ml_gdkpixbuf.c.orig 2012-04-03 13:56:29.618264702 +0200
+++ src/ml_gdkpixbuf.c 2012-04-03 13:56:58.106263510 +0200
@@ -119,7 +119,7 @@
value ret;
if (pb == NULL) ml_raise_null_pointer();
ret = alloc_custom (&ml_custom_GdkPixbuf, sizeof pb,
- 100, 1000);
+ 0, 1);
p = Data_custom_val (ret);
*p = ref ? g_object_ref (pb) : pb;
return ret;
--- src/ml_gobject.c.orig 2012-04-03 15:40:11.002004506 +0200
+++ src/ml_gobject.c 2012-04-03 15:41:04.938002250 +0200
@@ -219,7 +219,7 @@
CAMLprim value ml_g_value_new(void)
{
value ret = alloc_custom(&ml_custom_GValue,
sizeof(value)+sizeof(GValue),
- 20, 1000);
+ 0, 1);
/* create an MLPointer */
Field(ret,1) = (value)2;
((GValue*)&Field(ret,2))->g_type = 0;
@@ -272,14 +272,14 @@
custom_serialize_default, custom_deserialize_default };
CAMLprim value Val_gboxed(GType t, gpointer p)
{
- value ret = alloc_custom(&ml_custom_gboxed, 2*sizeof(value), 10, 1000);
+ value ret = alloc_custom(&ml_custom_gboxed, 2*sizeof(value), 0, 1);
Store_pointer(ret, g_boxed_copy (t,p));
Field(ret,2) = (value)t;
return ret;
}
CAMLprim value Val_gboxed_new(GType t, gpointer p)
{
- value ret = alloc_custom(&ml_custom_gboxed, 2*sizeof(value), 10, 1000);
+ value ret = alloc_custom(&ml_custom_gboxed, 2*sizeof(value), 0, 1);
Store_pointer(ret, p);
Field(ret,2) = (value)t;
return ret;
At startup is uses
top - 16:40:27 up 1 day, 7:01, 28 users, load average: 0.47, 0.50, 0.35
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
Cpu(s): 4.8%us, 1.3%sy, 0.0%ni, 93.6%id, 0.2%wa, 0.0%hi, 0.1%si,
0.0%st
Mem: 4004736k total, 3617960k used, 386776k free, 130704k buffers
Swap: 4070396k total, 9244k used, 4061152k free, 1730344k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+
COMMAND
10275 hans 20 0 529m 77m 13m S 14 2.0 0:01.66
vc_client.nativ
and 12 hours later
top - 04:40:07 up 1 day, 19:01, 35 users, load average: 0.00, 0.01, 0.05
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
Cpu(s): 20.2%us, 3.4%sy, 0.0%ni, 76.1%id, 0.1%wa, 0.0%hi, 0.2%si,
0.0%st
Mem: 4004736k total, 3828308k used, 176428k free, 143928k buffers
Swap: 4070396k total, 10708k used, 4059688k free, 1756524k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+
COMMAND
10275 hans 20 0 534m 82m 13m S 17 2.1 110:11.19
vc_client.nativ
Without the patch
top - 22:05:38 up 1 day, 12:26, 34 users, load average: 0.35, 0.16, 0.13
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
Cpu(s): 5.6%us, 1.5%sy, 0.0%ni, 92.6%id, 0.2%wa, 0.0%hi, 0.1%si,
0.0%st
Mem: 4004736k total, 3868136k used, 136600k free, 140900k buffers
Swap: 4070396k total, 9680k used, 4060716k free, 1837500k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+
COMMAND
25111 hans 20 0 453m 76m 13m S 14 2.0 0:13.68
vc_client_old.n
top - 10:05:19 up 2 days, 26 min, 35 users, load average: 0.01, 0.04, 0.05
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
Cpu(s): 20.4%us, 3.2%sy, 0.0%ni, 75.8%id, 0.4%wa, 0.0%hi, 0.2%si,
0.0%st
Mem: 4004736k total, 3830596k used, 174140k free, 261692k buffers
Swap: 4070396k total, 13640k used, 4056756k free, 1640452k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+
COMMAND
25111 hans 20 0 453m 76m 13m S 49 2.0 263:05.34
vc_client_old.n
So from this it seems that with the patch it still uses more and more CPU,
but at a much lower rate. However, it seems to increase memory usage with
the patch. I also tried to patch the wrappers.h file, but the memory
consumption just exploded.
So it is working better, but still not good enough. Is there some way to
prevent this kind of behavior? That is, no extra memory usage and no extra
CPU usage.
I have attached some additional profiling if that would be of any interest.
In short it seems to be that it is the GC that is consuming the CPU.
Best,
Hans Ole
On Tue, Apr 3, 2012 at 2:13 PM, Jerome Vouillon <vouillon@pps.jussieu.fr>wrote:
> On Tue, Apr 03, 2012 at 12:42:08PM +0200, Gerd Stolpmann wrote:
> > This reminds me of a problem I had with a specific C binding (for mysql),
> > years ago. That binding allocated custom blocks with badly chosen
> > parameters used/max (see the docs for caml_alloc_custom in
> > http://caml.inria.fr/pub/docs/manual-ocaml/manual032.html#toc144). If
> the
> > ratio used/max is > 0, these parameters accelerate the GC. If the custom
> > blocks are frequently allocated, this can have a dramatic effect, even
> for
> > quite small used/max ratios. The solution was to change the code, and to
> > set used=0 and max=1.
> >
> > This type of problem would match your observation that the GC works more
> > and more the longer the program runs, i.e. the more custom blocks have
> > been allocated.
> >
> > The problem basically also exists with bigarrays - with
> > used=<size_of_bigarary> and max=256M (hardcoded).
>
> I have also observed this with input-output channels (in_channel and
> out_channel), where used = 1 and max = 1000. A full major GC is
> performed every time a thousand files are opened, which can result on
> a significant overhead when you open lot of files and the heap is
> large.
>
> -- Jerome
>
[-- Attachment #1.2: Type: text/html, Size: 6515 bytes --]
[-- Attachment #2: MemoryProfiling.txt --]
[-- Type: text/plain, Size: 13629 bytes --]
vc_client.native has run for about 19 hours
vc_client_old.native has for about 14 hours
Tasks: 272 total, 3 running, 269 sleeping, 0 stopped, 0 zombie
Cpu(s): 20.2%us, 3.2%sy, 0.0%ni, 76.3%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st
Mem: 4004736k total, 3633672k used, 371064k free, 262916k buffers
Swap: 4070396k total, 14160k used, 4056236k free, 1417088k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
25111 hans 20 0 453m 76m 13m R 50 2.0 319:04.86 vc_client_old.n
10275 hans 20 0 553m 99m 13m S 19 2.5 188:30.50
vc_client.nativ
vc_client_old.native
CPU: Intel Sandy Bridge microarchitecture, speed 2.201e+06 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000
samples % image name symbol name
230093 51.8571 vc_client_old.native mark_slice
99659 22.4606 vc_client_old.native sweep_slice
76563 17.2553 vc_client_old.native caml_page_table_lookup
13215 2.9783 vc_client_old.native caml_fl_merge_block
1814 0.4088 vc_client_old.native truncate_flp
1013 0.2283 vc_client_old.native caml_gc_stat
784 0.1767 vc_client_old.native caml_iterate_global_roots
775 0.1747 vc_client_old.native caml_fl_allocate
564 0.1271 vc_client_old.native camlLividisvc__Nalbuf_tools__replace_pattern_1033
545 0.1228 vc_client_old.native caml_oldify_local_roots
503 0.1134 vc_client_old.native caml_modify
453 0.1021 vc_client_old.native camlLividisvc__Nalbuf_bytestream__search_1047
446 0.1005 vc_client_old.native caml_delete_global_root
412 0.0929 vc_client_old.native caml_do_roots
343 0.0773 vc_client_old.native caml_darken
306 0.0690 vc_client_old.native lwt_glib_get_sources
297 0.0669 vc_client_old.native caml_insert_global_root
250 0.0563 vc_client_old.native camlUq_gtk__fun_1484
213 0.0480 vc_client_old.native camlHashtbl__find_1093
205 0.0462 vc_client_old.native compare_val
195 0.0439 vc_client_old.native camlLwt_glib__enter_1079
191 0.0430 [vdso] (tgid:25111 range:0x7fff12dff000-0x7fff12e00000) [vdso] (tgid:25111 range:0x7fff12dff000-0x7fff12e00000)
190 0.0428 vc_client_old.native caml_stash_backtrace
188 0.0424 vc_client_old.native caml_oldify_one
185 0.0417 vc_client_old.native camlHashtbl__replace_1112
179 0.0403 vc_client_old.native caml_alloc_string
176 0.0397 vc_client_old.native camlMedia_ctrl__fun_3055
176 0.0397 vc_client_old.native caml_c_call
173 0.0390 vc_client_old.native camlList__iter_1074
171 0.0385 vc_client_old.native caml_apply2
164 0.0370 vc_client_old.native caml_obj_dup
159 0.0358 vc_client_old.native caml_send0
vc_client.native
CPU: Intel Sandy Bridge microarchitecture, speed 2.201e+06 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000
samples % image name symbol name
50295 45.3259 vc_client.native mark_slice
17964 16.1892 vc_client.native sweep_slice
14450 13.0224 vc_client.native caml_page_table_lookup
2137 1.9259 vc_client.native caml_fl_merge_block
1397 1.2590 vc_client.native caml_fl_allocate
1305 1.1761 vc_client.native caml_gc_stat
599 0.5398 vc_client.native camlLividisvc__Nalbuf_tools__replace_pattern_1033
546 0.4921 vc_client.native camlLividisvc__Nalbuf_bytestream__search_1047
544 0.4903 vc_client.native caml_modify
540 0.4866 vc_client.native caml_delete_global_root
422 0.3803 vc_client.native caml_iterate_global_roots
404 0.3641 vc_client.native truncate_flp
339 0.3055 vc_client.native camlUq_gtk__fun_1484
327 0.2947 vc_client.native camlHashtbl__find_1093
311 0.2803 vc_client.native caml_insert_global_root
290 0.2613 vc_client.native lwt_glib_get_sources
284 0.2559 vc_client.native caml_oldify_local_roots
260 0.2343 vc_client.native camlMedia_ctrl__fun_3055
260 0.2343 vc_client.native camlUnixqueue_select__fun_2366
248 0.2235 vc_client.native compare_val
239 0.2154 vc_client.native caml_stash_backtrace
220 0.1983 [vdso] (tgid:10275 range:0x7fffea534000-0x7fffea535000) [vdso] (tgid:10275 range:0x7fffea534000-0x7fffea535000)
219 0.1974 vc_client.native camlLwt_glib__enter_1079
206 0.1856 vc_client.native camlList__iter_1074
202 0.1820 vc_client.native caml_apply2
196 0.1766 vc_client.native caml_c_call
194 0.1748 vc_client.native caml_send0
180 0.1622 vc_client.native camlHashtbl__replace_1112
180 0.1622 vc_client.native caml_alloc_string
176 0.1586 vc_client.native caml_obj_dup
176 0.1586 vc_client.native caml_oldify_one
173 0.1559 vc_client.native caml_remove_generational_global_root
171 0.1541 vc_client.native caml_curry3_2
vc_client.native
CPU: Intel Sandy Bridge microarchitecture, speed 2.201e+06 MHz (estimated)
Counted LLC_MISSES events (Last level cache demand requests from this core that missed the LLC) with a unit
mask of 0x41 (No unit mask) count 100000
samples % image name symbol name
598 53.3452 vc_client.native mark_slice
261 23.2828 vc_client.native caml_page_table_lookup
23 2.0517 vc_client.native sweep_slice
13 1.1597 vc_client.native caml_fl_allocate
6 0.5352 vc_client.native camlUq_gtk__fun_1484
5 0.4460 vc_client.native caml_delete_global_root
4 0.3568 vc_client.native camlEqueue__run_1119
4 0.3568 vc_client.native camlHashtbl__find_1093
4 0.3568 vc_client.native camlLividisvc__Nalbuf_bytestream__search_1047
4 0.3568 vc_client.native camlMedia_ctrl__fun_3055
4 0.3568 vc_client.native camlUnixqueue_select__method_run_1662
4 0.3568 vc_client.native caml_callback_exn
4 0.3568 vc_client.native caml_iterate_global_roots
4 0.3568 vc_client.native caml_modify
3 0.2676 vc_client.native camlAudio_encoder__transfer_audio_frame_1052
3 0.2676 vc_client.native camlBitstring_structs__string_of_rtp_ext_1030
3 0.2676 vc_client.native camlHashtbl__replace_1112
3 0.2676 vc_client.native camlLwt_glib__leave_1085
3 0.2676 vc_client.native camlSequencer__serialize_header_2231
3 0.2676 vc_client.native camlUnixqueue_select__fun_2366
3 0.2676 vc_client.native camlUq_gtk__code_begin
3 0.2676 vc_client.native caml_apply2
3 0.2676 vc_client.native caml_curry3_2
3 0.2676 vc_client.native caml_gc_stat
3 0.2676 vc_client.native caml_hash_univ_param
3 0.2676 vc_client.native caml_send0
3 0.2676 vc_client.native compare_val
2 0.1784 vc_client.native camlBuffer__create_1039
2 0.1784 vc_client.native camlList__iter_1074
2 0.1784 vc_client.native camlLwt__wait_1290
2 0.1784 vc_client.native camlLwt_sequence__loop_1098
2 0.1784 vc_client.native camlMedia_ctrl__fun_3085
2 0.1784 vc_client.native camlMedia_ctrl__receiver_cbfun_2123
2 0.1784 vc_client.native camlPervasives__min_1022
2 0.1784 vc_client.native camlProfiles__fec_percentage_1232
2 0.1784 vc_client.native camlScanf__scan_1702
2 0.1784 vc_client.native camlScanf__scan_format_1685
2 0.1784 vc_client.native camlSequencer__my_push_2260
2 0.1784 vc_client.native camlUnixqueue_select__fun_1864
2 0.1784 vc_client.native camlUnixqueue_select__method_protect_1800
2 0.1784 vc_client.native camlUnixqueue_select__method_uq_handler_1726
2 0.1784 vc_client.native caml_curry4_1
2 0.1784 vc_client.native caml_insert_global_root
vc_client_old.native
CPU: Intel Sandy Bridge microarchitecture, speed 2.201e+06 MHz (estimated)
Counted LLC_MISSES events (Last level cache demand requests from this core that missed the LLC) with a unit mask of 0x41 (No unit mask) count 100000
samples % symbol name
2411 65.8563 mark_slice
923 25.2117 caml_page_table_lookup
101 2.7588 sweep_slice
10 0.2731 caml_iterate_global_roots
8 0.2185 caml_oldify_local_roots
7 0.1912 caml_fl_allocate
7 0.1912 caml_fl_merge_block
5 0.1366 camlHashtbl__replace_1112
5 0.1366 camlLividisvc__Nalbuf_bytestream__search_1047
5 0.1366 camlUq_gtk__fun_1484
5 0.1366 caml_delete_global_root
4 0.1093 camlLwt_glib__enter_1079
4 0.1093 caml_darken
4 0.1093 caml_send2
3 0.0819 camlBitstring___add_bits_1532
3 0.0819 camlEqueue__run_1119
3 0.0819 camlLwt_sequence__loop_1098
3 0.0819 camlSubstream_seqnum__code_begin
3 0.0819 caml_alloc_string
3 0.0819 compare_val
3 0.0819 lwt_glib_get_sources
2 0.0546 camlBitstring__add_byte_1515
2 0.0546 camlBuffer__create_1039
2 0.0546 camlGObj__method_draw_3325
2 0.0546 camlHashtbl__find_1093
2 0.0546 camlHashtbl__find_1198
2 0.0546 camlHashtbl__iter_1129
2 0.0546 camlList__iter_1074
2 0.0546 camlLividicore__Filter__eval_tree_1110
2 0.0546 camlMcsvc_types__add_header_1175
2 0.0546 camlMedia_ctrl__fun_3085
2 0.0546 camlPervasives__$5e_1112
2 0.0546 camlSubstream_seqnum__my_push_1197
2 0.0546 camlUnixqueue_select__fun_2314
2 0.0546 camlUnixqueue_util__dlogr_1060
2 0.0546 caml_apply2
2 0.0546 caml_apply3
2 0.0546 caml_c_call
2 0.0546 caml_curry3_1
2 0.0546 caml_do_roots
2 0.0546 caml_gc_stat
2 0.0546 caml_hash_univ_param
2 0.0546 caml_modify
2 0.0546 fdlist_to_fdset
at startup it looks like this:
top - 12:16:03 up 2 days, 2:37, 35 users, load average: 0.02, 0.02, 0.05
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
Cpu(s): 19.7%us, 4.1%sy, 0.0%ni, 75.7%id, 0.2%wa, 0.0%hi, 0.3%si, 0.0%st
Mem: 4004736k total, 3634484k used, 370252k free, 263976k buffers
Swap: 4070396k total, 14160k used, 4056236k free, 1424216k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
12906 hans 20 0 453m 76m 13m S 14 2.0 0:14.62 vc_client_old.n
CPU: Intel Sandy Bridge microarchitecture, speed 2.201e+06 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000
samples % image name symbol name
12046 25.2045 vc_client_old.native mark_slice
8590 17.9733 vc_client_old.native caml_page_table_lookup
6311 13.2049 vc_client_old.native sweep_slice
718 1.5023 vc_client_old.native camlLividisvc__Nalbuf_tools__replace_pattern_1033
633 1.3245 vc_client_old.native caml_fl_allocate
536 1.1215 vc_client_old.native caml_modify
505 1.0566 vc_client_old.native caml_do_roots
484 1.0127 vc_client_old.native camlLividisvc__Nalbuf_bytestream__search_1047
483 1.0106 vc_client_old.native caml_iterate_global_roots
471 0.9855 vc_client_old.native caml_darken
412 0.8621 vc_client_old.native caml_delete_global_root
371 0.7763 vc_client_old.native caml_oldify_local_roots
323 0.6758 vc_client_old.native caml_insert_global_root
304 0.6361 vc_client_old.native caml_fl_merge_block
291 0.6089 vc_client_old.native lwt_glib_get_sources
266 0.5566 vc_client_old.native camlHashtbl__find_1093
216 0.4519 [vdso] (tgid:12906 range:0x7fffba5fc000-0x7fffba5fd000) [vdso] (tgid:12906 range:0x7fffba5fc000-0x7fffba5fd000)
216 0.4519 vc_client_old.native compare_val
215 0.4499 vc_client_old.native camlLwt_glib__enter_1079
205 0.4289 vc_client_old.native camlUq_gtk__fun_1484
202 0.4227 vc_client_old.native caml_alloc_string
200 0.4185 vc_client_old.native caml_c_call
193 0.4038 vc_client_old.native caml_oldify_one
192 0.4017 vc_client_old.native caml_stash_backtrace
187 0.3913 vc_client_old.native camlMedia_ctrl__fun_3055
next prev parent reply other threads:[~2012-04-04 10:53 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-03-21 9:49 Hans Ole Rafaelsen
2012-03-22 15:03 ` Goswin von Brederlow
2012-03-23 9:32 ` Hans Ole Rafaelsen
2012-03-24 14:00 ` Goswin von Brederlow
2012-04-01 19:57 ` Richard W.M. Jones
2012-04-02 8:15 ` Hans Ole Rafaelsen
2012-04-02 10:13 ` Richard W.M. Jones
2012-04-02 13:40 ` Hans Ole Rafaelsen
2012-04-02 11:26 ` John Carr
2012-04-03 10:42 ` Gerd Stolpmann
2012-04-03 12:13 ` Jerome Vouillon
2012-04-04 10:53 ` Hans Ole Rafaelsen [this message]
2012-04-04 11:30 ` Gabriel Scherer
2012-04-04 12:55 ` Gerd Stolpmann
2012-04-07 13:27 ` Hans Ole Rafaelsen
2012-04-04 13:55 ` [Caml-list] GC speed for custom blocks, was: " Gerd Stolpmann
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CALs4vDYYAjv0RN0kFx8wPqNzFW4t8LZNfaqE0-B5pzS69xB02g@mail.gmail.com \
--to=hrafaelsen@gmail.com \
--cc=caml-list@inria.fr \
--cc=info@gerd-stolpmann.de \
--cc=rich@annexia.org \
--cc=vouillon@pps.jussieu.fr \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox