From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id SAA14874 for caml-redistribution@pauillac.inria.fr; Fri, 25 Feb 2000 18:06:27 +0100 (MET) Resent-Message-Id: <200002251706.SAA14874@pauillac.inria.fr> Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id RAA25222 for ; Fri, 25 Feb 2000 17:27:55 +0100 (MET) Received: from ext.lri.fr (ext.lri.fr [129.175.15.4]) by nez-perce.inria.fr (8.8.7/8.8.7) with ESMTP id RAA05358 for ; Fri, 25 Feb 2000 17:27:54 +0100 (MET) Received: from pc87.lri.fr (pc87 [129.175.8.106]) by ext.lri.fr (8.9.3/jtpda-5.3.2) with ESMTP id RAA19567 ; Fri, 25 Feb 2000 17:27:53 +0100 (MET) Received: by pc87.lri.fr (8.9.3/feuille) id RAA14542 ; Fri, 25 Feb 2000 17:26:52 +0100 X-Authentication-Warning: pc87: jcourant set sender to jcourant@pc87.lri.fr using -f From: Judicael Courant MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit Message-ID: <14518.44303.707274.695470@pc87> Date: Fri, 25 Feb 2000 17:25:51 +0100 (MET) To: Stephan Houben Cc: caml-list@inria.fr Subject: Re: symbol managment in Caml In-Reply-To: <20000225131622.A4595@pcrm.win.tue.nl> References: <20000225131622.A4595@pcrm.win.tue.nl> X-Mailer: VM 6.62 under Emacs 20.3.1 Resent-From: weis@pauillac.inria.fr Resent-Date: Fri, 25 Feb 2000 18:06:27 +0100 Resent-To: caml-redistribution@pauillac.inria.fr Hi, In his message of Fri February 25, 2000, Stephan Houben writes: > > The interpreter often needs to search a hash table with a string > as key. A common optimization for this is to use pointer identity > instead of string equality, and "intern" every string before using > it as a key to the hash table. > > Unfortunately, I don't see how to do this with the current O'Caml > libs. You can do identity checks with ==, but there seems no way > to get a good hash corresponding to ==. > > Has anyone already written some code for this task? > I guess it is very difficult to provide a good hash function corresponding to == because the GC may move the block the string has been allocated to at any time. However, in older versions of Coq (http://coq.inria.fr) there was some code achieving what you are looking for : roughly, identifiers were internally represented as (unique) integers and we had two conversion functions ident_of_string and string_of_ident (each time a new string was interned through ident_of_string, a new integer was allocated). Therefore, ident comparison was integer comparison, and we had a hash function over idents. NB : after a while, the code implementing this was removed from Coq (we spent more time (re)interning terms than we saved...) Judicaël Courant. -- Judicael.Courant@lri.fr, http://www.lri.fr/~jcourant/ (+33) (0)1 69 15 64 85 "Montre moi des morceaux de ton monde, et je te montrerai le mien" Tim, matricule #929, condamné ŕ mort. http://rozenn.picard.free.fr/tim.html