From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.1.3 Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by yquem.inria.fr (Postfix) with ESMTP id B5463BC69 for ; Mon, 14 Jan 2008 17:03:37 +0100 (CET) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgAAAIMWi0fC2fJXnmdsb2JhbACQCgEBAQEHCimYDA X-IronPort-AV: E=Sophos;i="4.24,283,1196636400"; d="scan'208";a="6023530" Received: from anchor-post-37.mail.demon.net ([194.217.242.87]) by mail2-smtp-roc.national.inria.fr with ESMTP; 14 Jan 2008 17:03:37 +0100 Received: from orion.metastack.com ([80.177.38.218]) by anchor-post-37.mail.demon.net with esmtp (Exim 4.68) id 1JERmi-0003ox-P7 for caml-list@yquem.inria.fr; Mon, 14 Jan 2008 16:03:36 +0000 Received: from countertenor (cpc3-cmbg6-0-0-cust100.cmbg.cable.ntl.com [81.101.140.101]) (authenticated bits=0) by orion.metastack.com (8.13.4/8.13.3) with ESMTP id m0EFoPDK012379 (version=TLSv1/SSLv3 cipher=RC4-MD5 bits=128 verify=NO) for ; Mon, 14 Jan 2008 15:50:26 GMT From: "David Allsopp" To: References: <200801101709.14110.jon@ffconsultancy.com><200801140956.25449.ober.14@osu.edu><001401c856c3$6c692260$6c7ba8c0@countertenor> <200801141044.05670.ober.14@osu.edu> Subject: RE: [Caml-list] Re: Hash clash in polymorphic variants Date: Mon, 14 Jan 2008 16:03:35 -0000 Organization: MetaStack Solutions Ltd. Message-ID: <001501c856c7$051daa50$6c7ba8c0@countertenor> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook 11 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3198 Thread-Index: AchWwn68eLxXSwkPTgGmA+kra5O0YwAAuhIw In-Reply-To: <200801141044.05670.ober.14@osu.edu> X-Scanned-By: MIMEDefang 2.63 on 172.16.28.218 X-Spam: no; 0.00; hash:01 variants:01 trivial:01 dlopen:01 hashes:01 clashes:01 polymorphic:01 wrote:01 integer:01 caml-list:01 algorithm:01 slower:02 string:02 string:02 slightly:03 Kuba Ober wrote: > A trivial solution to that is to keep both, as obviously each time an > equivalent of dlopen() is made, everything has to be rehashed. gperf > is "slightly" memory-hungry, so surely it'd need to be something using a > different algorithm. I'm talking hypothetically, but I also think it's a > weird design decision to use those possibly-colliding hashes. I agree that it's a bit weird - but the clashes are very rare (and the function was designed to keep them rare for "normal" usage). > String sorting/comparison isn't exactly a CPU killer, so couldn't the > original names have been used instead? String comparison is much slower than integer comparison... we're talking about one CPU instruction compared to a for loop! Jon would never use them again :o) Not to mention the storage overhead of keeping the tag names in memory - not great if you've got long lists of `YetAnotherTag. David