From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: * X-Spam-Status: No, score=1.1 required=5.0 tests=AWL,DNS_FROM_RFC_ABUSE, SPF_SOFTFAIL autolearn=disabled version=3.1.3 Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by yquem.inria.fr (Postfix) with ESMTP id 9BF14BC6B for ; Fri, 11 Jan 2008 01:15:36 +0100 (CET) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Ao8CAF9DhkeCNhAB/2dsb2JhbACpdA X-IronPort-AV: E=Sophos;i="4.24,269,1196636400"; d="scan'208";a="5925310" Received: from kurims.kurims.kyoto-u.ac.jp ([130.54.16.1]) by mail2-smtp-roc.national.inria.fr with ESMTP; 11 Jan 2008 01:15:35 +0100 Received: from localhost (orion [130.54.16.5]) by kurims.kurims.kyoto-u.ac.jp (8.13.8/8.13.8) with ESMTP id m0B0FPBs009425; Fri, 11 Jan 2008 09:15:26 +0900 (JST) Date: Fri, 11 Jan 2008 09:15:19 +0900 (JST) Message-Id: <20080111.091519.45987428.garrigue@math.nagoya-u.ac.jp> To: jon@ffconsultancy.com Cc: caml-list@yquem.inria.fr Subject: Re: [Caml-list] Hash clash in polymorphic variants From: Jacques Garrigue In-Reply-To: <200801101709.14110.jon@ffconsultancy.com> References: <200801101709.14110.jon@ffconsultancy.com> X-Mailer: Mew version 4.2 on Emacs 22.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam: no; 0.00; hash:01 variants:01 constructors:01 avoided:01 clashing:01 hash:01 variants:01 hashes:01 clashes:01 ocamlc:01 translating:01 enum:01 clashes:01 cheers:01 polymorphic:01 From: Jon Harrop > ISTR advice that constructors sharing the first few characters should be > avoided in order to reduce the likelihood of clashing hash values for > polymorphic variants. Is that right? Not at all. If the first characters are identical it just means that an identical value will be added to the hashes of the suffixes, which actually means that you lower the probability of getting conflicts :-) The hash functions guarantees that all keys of strictly less than 5 characters will map to different. The probability of getting clashes being really low, you should not be concerned by this. Just check aferwards. A simple way to do it is to produce a big type containing all the tags, and feed it to ocamlc. > I'm interested in automatically translating the GL_* enum from > OpenGL into polymorphic variants. So although it is generated code I > have little control over it, e.g. I cannot change the translation as > OpenGL gets extended because code will already be using the existing > names. In the event you get a conflict when openGL is extended, you can still add a special case for the newly added tags. I hope this does not happen, but the birthday theorem tells you that when you get enough participants, clashes are hard to avoid. Cheers, Jacques Garrigue