From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: * X-Spam-Status: No, score=1.0 required=5.0 tests=AWL,SPF_NEUTRAL autolearn=disabled version=3.1.3 Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105]) by yquem.inria.fr (Postfix) with ESMTP id 28AC9BBCB for ; Fri, 7 Mar 2008 00:51:13 +0100 (CET) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: An0DAG8S0EfAXQImh2dsb2JhbACDIY1XAQEBCAoplFWHAg X-IronPort-AV: E=Sophos;i="4.25,458,1199660400"; d="scan'208";a="23460035" Received: from discorde.inria.fr ([192.93.2.38]) by mail4-smtp-sop.national.inria.fr with ESMTP; 07 Mar 2008 00:51:12 +0100 Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by discorde.inria.fr (8.13.6/8.13.6) with ESMTP id m26NpCH6002874 (version=TLSv1/SSLv3 cipher=RC4-SHA bits=128 verify=OK) for ; Fri, 7 Mar 2008 00:51:12 +0100 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AkIDALsR0EfRVciugWdsb2JhbACDIY1XAQEIBQQJChaUVYcB X-IronPort-AV: E=Sophos;i="4.25,458,1199660400"; d="scan'208";a="9972418" Received: from wf-out-1314.google.com ([209.85.200.174]) by mail3-smtp-sop.national.inria.fr with ESMTP; 07 Mar 2008 00:51:11 +0100 Received: by wf-out-1314.google.com with SMTP id 26so188026wfd.0 for ; Thu, 06 Mar 2008 15:51:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from:to:subject:cc:mime-version:content-type:content-transfer-encoding:content-disposition; bh=0DHURCzHtEi3cjL+ui7CXbJJ8N4Mno3i/bY86lfjVbU=; b=Zcw5tHYBzNdvkV5oOOPOryxT7dYGFYdPb5r3VrtYA3P9uu/V7KUPncrPEvczhWfdrEp/pxioBMNBiTHGmPwFjxT+ajBP6I7uQPXQJ/nNbhKZvvByJkhPyrcLzWgiuIDc3B+OINzcLSlrZ/4asr9L6RdQcAF6K1QI6OAB6mxhpcA= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:to:subject:cc:mime-version:content-type:content-transfer-encoding:content-disposition; b=YCO55I7TctThujieX/9RO38OwQ3kAgBd8piuO62JqsgTr7spUnHdz2ahFDfYg0P4aGuEf0ey3rIk1LGXH2KItGIjjqDC/zTk9+/L/gfjS/EgZJNLdVPLDqgRS+elATWAxps9uJoMVV+VLbrn3PcmuotTE1aAekqxpJUVNsUtP8c= Received: by 10.142.154.20 with SMTP id b20mr386611wfe.143.1204847470284; Thu, 06 Mar 2008 15:51:10 -0800 (PST) Received: by 10.142.102.3 with HTTP; Thu, 6 Mar 2008 15:51:10 -0800 (PST) Message-ID: Date: Thu, 6 Mar 2008 18:51:10 -0500 From: "Markus Mottl" To: ocaml Subject: Global roots causing performance problems Cc: ocaml-users@janestcapital.com MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline X-Miltered: at discorde with ID 47D08370.000 by Joe's j-chkmail (http://j-chkmail . ensmp . fr)! X-Spam: no; 0.00; markus:01 mottl:01 markus:01 mottl:01 ocaml-values:01 ocaml-values:01 hash:01 integers:01 ocaml-value:01 hash:01 lablgtk:01 runtime:01 ocaml:01 garbage:01 closures:01 Hi, this is actually a somewhat long-standing issue, but we have only recently been hit by it in a way that makes working around it hard, and we would like to know whether there is a reasonable chance that this could be solved in a future OCaml-release: Many C-bindings use C-datastructures that refer to OCaml-values. This, of course, means that these OCaml-values must not be reclaimed as long as they can still be referred to. The current OCaml-FFI already provides functions (caml_register_global_root, etc.) for protecting such values. Unfortunately, it seems that this feature was not implemented with the possibility in mind that one day people might want to refer to many thousands of OCaml-values (most often closures) from C-data. The garbage collector obviously scans all global roots on every minor collection, which leads to a very substantial performance loss in such situations. The minor heap may often even be almost empty, i.e. we might only need to chase a few live values, while we would still have to scan through thousands of global roots. A common and unfortunately quite clumsy trick to work around this limitation is to use a hash table from integers to the values to be protected. Instead of storing the OCaml-value itself in the C-datastructure and protecting it, we only need to store the integer handle for the value, which does not need to be protected. C-land then only needs to perform a callback into OCaml-land with the integer handle in order to use the value it refers to, or to remove it it from the hash table if it isn't needed anymore. The problem here is that this only makes sense when writing a new library or when modifying a small existing one. We have run into this issue with LablGTK, which is just way too large and complex to make this a feasible thing to do. We therefore wonder whether it wouldn't be much more effective to fix the runtime. I don't know the exact details of how things currently work, but I guess that it would be possible to have two separate sets of global roots for the minor and major heap. Then, once a value gets oldified, the global root, too, could wander to the corresponding set. The set for the major heap could then be scanned only once per full major cycle, maybe even in slices, too. Would this suggestion be easy to implement? Best regards, Markus -- Markus Mottl http://www.ocaml.info markus.mottl@gmail.com