From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.1.3 Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by yquem.inria.fr (Postfix) with ESMTP id 63B29BC6C for ; Fri, 11 Jan 2008 19:41:09 +0100 (CET) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgAAAINHh0fC2fJXnmdsb2JhbACQHgEBAQEHBAYpgRSYUw X-IronPort-AV: E=Sophos;i="4.24,272,1196636400"; d="scan'208";a="7714827" Received: from anchor-post-37.mail.demon.net ([194.217.242.87]) by mail3-smtp-sop.national.inria.fr with ESMTP; 11 Jan 2008 19:41:09 +0100 Received: from orion.metastack.com ([80.177.38.218]) by anchor-post-37.mail.demon.net with esmtp (Exim 4.68) id 1JDOoW-0003HL-Nx for caml-list@yquem.inria.fr; Fri, 11 Jan 2008 18:41:08 +0000 Received: from countertenor (cpc3-cmbg6-0-0-cust100.cmbg.cable.ntl.com [81.101.140.101]) (authenticated bits=0) by orion.metastack.com (8.13.4/8.13.3) with ESMTP id m0BISDwr007963 (version=TLSv1/SSLv3 cipher=RC4-MD5 bits=128 verify=NO) for ; Fri, 11 Jan 2008 18:28:13 GMT From: "David Allsopp" To: References: <200801101709.14110.jon@ffconsultancy.com><200801110830.29988.ober.14@osu.edu><200801111348.15452.jon@ffconsultancy.com> <200801111114.03876.ober.14@osu.edu> Subject: RE: [Caml-list] Hash clash in polymorphic variants Date: Fri, 11 Jan 2008 18:40:53 -0000 Organization: MetaStack Solutions Ltd. Message-ID: <000c01c85481$7f727120$6c7ba8c0@countertenor> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook 11 In-Reply-To: <200801111114.03876.ober.14@osu.edu> X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3198 Thread-Index: AchUa0LbdXgC18mOTGuR0FoZm6H4ZwAFKoog X-Scanned-By: MIMEDefang 2.63 on 172.16.28.218 X-Spam: no; 0.00; hash:01 variants:01 non-issue:01 show-stopper:01 hash:01 variants:01 compiler:01 incorrectly:01 checker:01 polymorphic:01 polymorphic:01 wrote:01 wrote:01 caml-list:01 caml:02 Kuba Ober wrote: > On Friday 11 January 2008, Jon Harrop wrote: > > On Friday 11 January 2008 13:30:29 Kuba Ober wrote: > > > Are those collisions of any real importance? I mean, do they break > > > anything? If all they do is imply linearly searching a list of a few > > > elements, for the colliding entry, then it's a non-issue? > > > > It would prevent code from compiling so it would be a complete > > show-stopper. > > So what you're saying is that the implementation uses the hash with bucket > size of 1? That's kinda poor decision, methinks. I think you're missing the context - there's no hash table. See 18.3.6 in the manual - the hashed values (and resulting collisions) are to do with the internal representation of polymorphic variants. The compiler cannot process code that uses two polymorphic variants whose tag names will have the same internal representation (and therefore be incorrectly viewed as having the same value). The test is probably performed somewhere in the type checker... An alternative implementation might have been to lookup the tags (in a perfect hash table) using a system similar to caml_named_value but I imagine that the present method was preferred because it's simpler (and quite possibly faster) and collisions are rare (as Eric pointed out) - although in Jon's case the lack of a guarantee is unfortunate. Incidentally, and off-the-subject here, using a hash table with a bucket size of 1 is very important if you need performance guarantees on your hash table and have some other way of coping with collisions. David