From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105]) by sympa.inria.fr (Postfix) with ESMTPS id B370B7EEAF for ; Thu, 17 Jan 2013 17:41:59 +0100 (CET) Received-SPF: None (mail4-smtp-sop.national.inria.fr: no sender authenticity information available from domain of thelema314@gmail.com) identity=pra; client-ip=209.85.214.50; receiver=mail4-smtp-sop.national.inria.fr; envelope-from="thelema314@gmail.com"; x-sender="thelema314@gmail.com"; x-conformance=sidf_compatible Received-SPF: Pass (mail4-smtp-sop.national.inria.fr: domain of thelema314@gmail.com designates 209.85.214.50 as permitted sender) identity=mailfrom; client-ip=209.85.214.50; receiver=mail4-smtp-sop.national.inria.fr; envelope-from="thelema314@gmail.com"; x-sender="thelema314@gmail.com"; x-conformance=sidf_compatible; x-record-type="v=spf1" Received-SPF: None (mail4-smtp-sop.national.inria.fr: no sender authenticity information available from domain of postmaster@mail-bk0-f50.google.com) identity=helo; client-ip=209.85.214.50; receiver=mail4-smtp-sop.national.inria.fr; envelope-from="thelema314@gmail.com"; x-sender="postmaster@mail-bk0-f50.google.com"; x-conformance=sidf_compatible X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgUCAIgp+FDRVdYyjmdsb2JhbABEgkiDfaR8iT8BiRwIFg4BAQEBCQsJCRIGI4IeAQEEASMdAQgTEgsBAwELBgULDQ0dAgIhAQERAQUBChIGExKHdAEDCQYMnTSLZU+Ce4EGg3AKGScDClmHUQEFDIt8hB2BEwOIYYtVgVaBHIobgzEWKYJRgWQ X-IronPort-AV: E=Sophos;i="4.84,486,1355094000"; d="scan'208";a="168824776" Received: from mail-bk0-f50.google.com ([209.85.214.50]) by mail4-smtp-sop.national.inria.fr with ESMTP/TLS/RC4-SHA; 17 Jan 2013 17:41:58 +0100 Received: by mail-bk0-f50.google.com with SMTP id jf3so1489258bkc.9 for ; Thu, 17 Jan 2013 08:41:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=nGw3MruNYOng6PLAJp9x4IZEJ3n1+l0OCUSypKLh15g=; b=as6n1F78ZALnX10TwYm73HXuI3HcBv433r9UIF5MREqXg/ofEA+aR6rJYEEXi9g0gF jJLCNC+fCos0FUUQkJ38AF49hlDAYo+PZcf1jTamiRBXX07RVsZGR98kMHwL8XDN2Y3g lyGY8q7Y6B7uwEdUoeaAcb3yAOJDuHj1stQy4QUq4lXFVC6g3DdjLk92dM4NVSX7o14x t/VBQoJ7kheU1TfbY9Oi2Yh3cfBibfPIGVMTGYTSNXuRzKhA0lK8QYzbESFOouUBCXH9 IjGzscGm9uCuL4VV6oVcE21osBxK7YKWOVmH3q+eUKH+7HA+Gmm3+WkxLMSvKRKO2ux4 ETTA== MIME-Version: 1.0 X-Received: by 10.204.8.136 with SMTP id h8mr1688682bkh.62.1358440918184; Thu, 17 Jan 2013 08:41:58 -0800 (PST) Received: by 10.204.109.13 with HTTP; Thu, 17 Jan 2013 08:41:58 -0800 (PST) In-Reply-To: References: <50F81A48.4080901@cs.cornell.edu> Date: Thu, 17 Jan 2013 11:41:58 -0500 Message-ID: From: Edgar Friendly To: Nicholas Lucaroni Cc: Jean-Baptiste Jeannin , "caml-list@inria.fr" Content-Type: multipart/alternative; boundary=00151758844660754304d37eac52 Subject: Re: [Caml-list] Hash function: complexity and circular structures --00151758844660754304d37eac52 Content-Type: text/plain; charset=UTF-8 The source is quite easy to read, and is available here: https://github.com/ocaml/ocaml/blob/trunk/byterun/hash.c It turns out that even seeded MurmurHash3 does not prevent algorithmic attacks as expected, as demonstrated and explained at 29c3: http://events.ccc.de/congress/2012/Fahrplan/events/5152.en.html and www.youtube.com/watch?v=Vdrab3sB7MU. E. On Thu, Jan 17, 2013 at 10:46 AM, Nicholas Lucaroni < nicholas.r.lucaroni@gmail.com> wrote: > https://sympa.inria.fr/sympa/arc/caml-list/2011-07/msg00117.html > > That thread talks about the previous and a better alternative (which is in > 4.x). > > Xavier had said, > *The SVN trunk for OCaml contains a complete reimplementation of the* > *generic hash function that addresses both issues: lists and other > complex keys are traversed breadth-first in a more cautious manner > than before, and the mixing functions are based on MurmurHash 3, which > exhibits very good statistical properties. All this should go in the > next major release 3.13. So, everyone with an interest in efficient > hash tables is welcome to try the trunk sources and let me know of any > problems encountered.* > > On Thu, Jan 17, 2013 at 10:35 AM, Jean-Baptiste Jeannin < > jeannin@cs.cornell.edu> wrote: > >> Hello, >> >> The default OCaml function (Hashtbl.hash) says that it "always >> terminates, even on cyclic structures.". I would be curious to know what >> its complexity is, both on a finite list and on a cyclic list (created by >> let rec l = 1 :: l for example). Is the algorithm that is being used >> published anywhere? >> >> Also, this hashing function seems to be returning 0 on any cyclic list >> (at least the ones I tried). Is this normal? Does anyone know any better >> hashing function on cyclic lists? >> >> # Hashtbl.hash [ 1 ; 2 ];; >> - : int = 131199 >> # let rec ones = 1 :: ones;; >> val ones : int list = >> [1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; >> 1; 1; >> 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; >> 1; 1; >> 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; >> 1; 1; >> 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; >> 1; 1; >> 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; >> 1; 1; >> 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; >> 1; 1; >> 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; >> 1; 1; >> 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; >> 1; 1; >> 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; >> 1; 1; >> 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; >> 1; 1; >> 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; >> 1; 1; >> 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; >> ...] >> # Hashtbl.hash ones;; >> - : int = 0 >> # Hashtbl.hash (5 :: 4 :: ones);; >> - : int = 0 >> >> Thank you, >> Jean-Baptiste Jeannin >> >> -- >> Caml-list mailing list. Subscription management and archives: >> https://sympa.inria.fr/sympa/**arc/caml-list >> Beginner's list: http://groups.yahoo.com/group/**ocaml_beginners >> Bug reports: http://caml.inria.fr/bin/caml-**bugs >> > > --00151758844660754304d37eac52 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
The source is quite easy to read, and is availab= le here:

https://github.com/ocaml/ocaml/blob/trunk/byterun/hash.c
<= br>
It turns out that even seeded MurmurHash3 does not prevent algorithmi= c attacks as expected, as demonstrated and explained at 29c3: http://event= s.ccc.de/congress/2012/Fahrplan/events/5152.en.html and www.youtube.com/watch?v=3DVdrab3s= B7MU.

E.


On Thu, Jan 17, 2013 at 10:46 AM, Nicholas Lucaroni <= span dir=3D"ltr"><nicholas.r.lucaroni@gmail.com> wrote:
https://sympa.inria.fr/s= ympa/arc/caml-list/2011-07/msg00117.html

That thread talks about the previous and a better alternativ= e (which is in 4.x).=C2=A0

Xavier had said,
The=C2=A0SVN=C2=A0trunk=C2=A0for=C2=A0OC= aml=C2=A0contains=C2=A0a=C2=A0complete=C2=A0reimplementation=C2=A0of=C2=A0t= he
generic=C2= =A0hash=C2=A0function=C2=A0that=C2=A0addresses=C2=A0both=C2=A0issues:=C2=A0= lists=C2=A0and=C2=A0other
complex=C2=A0ke= ys=C2=A0are=C2=A0traversed=C2=A0breadth-first=C2=A0in=C2=A0a=C2=A0more=C2= =A0cautious=C2=A0manner
than=C2=A0befor= e,=C2=A0and=C2=A0the=C2=A0mixing=C2=A0functions=C2=A0are=C2=A0based=C2=A0on= =C2=A0MurmurHash=C2=A03,=C2=A0which
exhibits=C2=A0v= ery=C2=A0good=C2=A0statistical=C2=A0properties.=C2=A0=C2=A0All=C2=A0this=C2= =A0should=C2=A0go=C2=A0in=C2=A0the
next=C2=A0major= =C2=A0release=C2=A03.13.=C2=A0=C2=A0So,=C2=A0everyone=C2=A0with=C2=A0an=C2= =A0interest=C2=A0in=C2=A0efficient
hash=C2=A0table= s=C2=A0is=C2=A0welcome=C2=A0to=C2=A0try=C2=A0the=C2=A0trunk=C2=A0sources=C2= =A0and=C2=A0let=C2=A0me=C2=A0know=C2=A0of=C2=A0any
problems=C2=A0e= ncountered.

On Thu, Jan 17, 2013 at 10:35 AM, Jean-Bapti= ste Jeannin <jeannin@cs.cornell.edu> wrote:
Hello,

The default OCaml function (Hashtbl.hash) says that it "always termina= tes, even on cyclic structures.". I would be curious to know what its = complexity is, both on a finite list and on a cyclic list (created by let r= ec l =3D 1 :: l for example). Is the algorithm that is being used published= anywhere?

Also, this hashing function seems to be returning 0 on any cyclic list (at = least the ones I tried). Is this normal? Does anyone know any better hashin= g function on cyclic lists?

# Hashtbl.hash [ 1 ; 2 ];;
- : int =3D 131199
# let rec ones =3D 1 :: ones;;
val ones : int list =3D
=C2=A0 [1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1= ; 1; 1;
=C2=A0 =C2=A01; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;= 1; 1; 1; 1;
=C2=A0 =C2=A01; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;= 1; 1; 1; 1;
=C2=A0 =C2=A01; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;= 1; 1; 1; 1;
=C2=A0 =C2=A01; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;= 1; 1; 1; 1;
=C2=A0 =C2=A01; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;= 1; 1; 1; 1;
=C2=A0 =C2=A01; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;= 1; 1; 1; 1;
=C2=A0 =C2=A01; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;= 1; 1; 1; 1;
=C2=A0 =C2=A01; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;= 1; 1; 1; 1;
=C2=A0 =C2=A01; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;= 1; 1; 1; 1;
=C2=A0 =C2=A01; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;= 1; 1; 1; 1;
=C2=A0 =C2=A01; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;= 1; 1; 1;
=C2=A0 =C2=A0...]
# Hashtbl.hash ones;;
- : int =3D 0
# Hashtbl.hash (5 :: 4 :: ones);;
- : int =3D 0

Thank you,
Jean-Baptiste Jeannin

--
Caml-list mailing list. =C2=A0Subscription management and archives:
ht= tps://sympa.inria.fr/sympa/arc/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners<= /a>
Bug reports:
http://caml.inria.fr/bin/caml-bugs


--00151758844660754304d37eac52--