From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.1.3 Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by yquem.inria.fr (Postfix) with ESMTP id 4F0D2BC6D for ; Wed, 23 Jan 2008 13:08:35 +0100 (CET) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgAAAGe9lkeDrhCRh2dsb2JhbACQFwEBAQgKKZ0D X-IronPort-AV: E=Sophos;i="4.25,238,1199660400"; d="scan'208";a="6470569" Received: from smeltpunt.science.ru.nl ([131.174.16.145]) by mail2-smtp-roc.national.inria.fr with ESMTP; 23 Jan 2008 13:08:35 +0100 Received: from tandem.cs.ru.nl (tandem.cs.ru.nl [131.174.142.18]) by smeltpunt.science.ru.nl (8.13.7/5.23) with ESMTP id m0NC8Vqk014338 for ; Wed, 23 Jan 2008 13:08:31 +0100 (MET) Received: from tews by tandem.cs.ru.nl with local (Exim 4.63) (envelope-from ) id 1JHeP9-0003VZ-Oa for caml-list@yquem.inria.fr; Wed, 23 Jan 2008 13:08:31 +0100 From: Hendrik Tews To: caml-list@yquem.inria.fr Subject: Re: [Caml-list] Questions on replacing finalizers and memory footprints References: <4757D904.2090502@functionality.de> <475909E8.7010301@inria.fr> <4759241C.4070202@lri.fr> Date: Wed, 23 Jan 2008 13:08:09 +0100 In-Reply-To: <4759241C.4070202@lri.fr> (Jean-Christophe =?iso-8859-1?Q?Fil?= =?iso-8859-1?Q?li=E2tre's?= message of "Fri, 07 Dec 2007 11:44:44 +0100") Message-ID: User-Agent: Gnus/5.110006 (No Gnus v0.6) Emacs/21.4 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=latin-1 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 2.63 on 131.174.16.145 X-Spam: no; 0.00; hendrik:01 tews:01 tews:01 finalizers:01 footprints:01 filliatre:01 filliatre:01 lri:01 lri:01 filliatr:01 hash:01 hash:01 runtime:01 hendrik:01 equality:01 Jean-Christophe Filli=E2tre writes: > Jean-Christophe Filli=E2tre's "size" library does exactly this: >=20 > http://www.lri.fr/~filliatr/software.en.html Indeed. However, note that it uses internally a hash table to store blocks already considered (in order to correctly account for sharing), and thus it is potentially incorrect if the GC moves some blocks during the count, for instance during a resizing of the hash table (which triggers the GC). I don't know how to avoid this issue; any help is welc= ome. Use the normal hash function (ie. the hash depends on the block and not on its address) and use the blocks themselves as keys! Then the GC will not change the hash and will update the keys in the hash table. Of course all in a hash table with physical equality. I use this solution in memcheck (http://www.cs.ru.nl/~tews/memcheck/). The problem is that your runtime will be quadratic in the number of blocks that are equal (=3D) but different (not =3D=3D). For my applications of memcheck this is problematic, because the data contains lots of (ref None) blocks, which cannot be identified for obvious reasons. The (apparently abondoned) ocaml-ty branch (http://www.pps.jussieu.fr/~henry/marshal/) uses a list instead of a hash table and is therefore quadratic in _all_ blocks. The performence is only sufficient for toy applications (4 hours for checking 4MB). For applications like size or memcheck it would be nice if the GC could notify the program when it moves blocks. Bye, Hendrik (Better answering late than never...)