From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail1-relais-roc.national.inria.fr (mail1-relais-roc.national.inria.fr [192.134.164.82]) by sympa.inria.fr (Postfix) with ESMTPS id EC25C7EEAF for ; Thu, 17 Jan 2013 18:05:05 +0100 (CET) Received-SPF: None (mail1-smtp-roc.national.inria.fr: no sender authenticity information available from domain of nicholas.r.lucaroni@gmail.com) identity=pra; client-ip=209.85.216.179; receiver=mail1-smtp-roc.national.inria.fr; envelope-from="nicholas.r.lucaroni@gmail.com"; x-sender="nicholas.r.lucaroni@gmail.com"; x-conformance=sidf_compatible Received-SPF: Pass (mail1-smtp-roc.national.inria.fr: domain of nicholas.r.lucaroni@gmail.com designates 209.85.216.179 as permitted sender) identity=mailfrom; client-ip=209.85.216.179; receiver=mail1-smtp-roc.national.inria.fr; envelope-from="nicholas.r.lucaroni@gmail.com"; x-sender="nicholas.r.lucaroni@gmail.com"; x-conformance=sidf_compatible; x-record-type="v=spf1" Received-SPF: None (mail1-smtp-roc.national.inria.fr: no sender authenticity information available from domain of postmaster@mail-qc0-f179.google.com) identity=helo; client-ip=209.85.216.179; receiver=mail1-smtp-roc.national.inria.fr; envelope-from="nicholas.r.lucaroni@gmail.com"; x-sender="postmaster@mail-qc0-f179.google.com"; x-conformance=sidf_compatible X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgQCAFQu+FDRVdizk2dsb2JhbABEgkioeYk/AYkcCBYOAQEBAQkJCwkUBCOCHgEBBAFAAQgTEgsBAwELBgULAwoNISEBAREBBQEKEgYTEod0AQMJBgydM4w0gnuBBoNxChknAwpZh1EBBQyLfIUwA4hhi1WBVoEcihuDMRYpglGBZA X-IronPort-AV: E=Sophos;i="4.84,486,1355094000"; d="scan'208";a="190433848" Received: from mail-qc0-f179.google.com ([209.85.216.179]) by mail1-smtp-roc.national.inria.fr with ESMTP/TLS/RC4-SHA; 17 Jan 2013 18:05:04 +0100 Received: by mail-qc0-f179.google.com with SMTP id b14so1777789qcs.38 for ; Thu, 17 Jan 2013 09:05:03 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=uh6vr3VdNKS5OW4wXM9H+tvHOPNEuXCFnFQmRGTJDSo=; b=CsF5aY2i+9chzudpERl7eSzMEtkB5h6ETRBnxqcNmXGRlOLXyrWo2/1knfY3M+ddEP XxBCJSxX6P0vqtqewYuFkrJqDsOKDgXKbEosS4bIwO9LtjcoogvlYKvwCcg3Nu41o/8N gvlGjwfesKNY5QGfO8BQs7K6bHgreGeOsmnnpEwcsUPdP1MmZ3NZ3stqgFSNZv4xLSig BcCIzNdgfk/cF4khb6xOzBIIYPhoc92bzJ4+oGlPwIAaI4/Toz0tibPuOKINYQtb+Ytd 9444lEygy+WG5yoxF5o8Zf4+oFEfAWe6JcKYlarjSBivlSPvaEiOrpzXOcjrjgCabHBS gAhA== MIME-Version: 1.0 X-Received: by 10.229.177.71 with SMTP id bh7mr1424522qcb.114.1358442303362; Thu, 17 Jan 2013 09:05:03 -0800 (PST) Received: by 10.49.51.136 with HTTP; Thu, 17 Jan 2013 09:05:03 -0800 (PST) In-Reply-To: References: <50F81A48.4080901@cs.cornell.edu> Date: Thu, 17 Jan 2013 12:05:03 -0500 Message-ID: From: Nicholas Lucaroni To: Edgar Friendly Cc: Jean-Baptiste Jeannin , "caml-list@inria.fr" Content-Type: multipart/alternative; boundary=001636e1ee24f095b404d37efe35 Subject: Re: [Caml-list] Hash function: complexity and circular structures --001636e1ee24f095b404d37efe35 Content-Type: text/plain; charset=ISO-8859-1 You should update to OCaml 4.xx I don't think the hash function works at all on lists in 3.12.1 (in the sense that it, as you've found, it always returns 0). in ocaml 4.00+rc1 # Hashtbl.hash (6 :: 7 :: 8 :: 9 :: 10 :: 11 :: 12 :: 13:: 14 :: 15 :: []);; - : int = 418135511 in ocaml 3.12.1 # Hashtbl.hash (6 :: 7 :: 8 :: 9 :: 10 :: 11 :: 12 :: 13:: 14 :: 15 :: []);; - : int = 0 On Thu, Jan 17, 2013 at 11:41 AM, Edgar Friendly wrote: > The source is quite easy to read, and is available here: > > https://github.com/ocaml/ocaml/blob/trunk/byterun/hash.c > > It turns out that even seeded MurmurHash3 does not prevent algorithmic > attacks as expected, as demonstrated and explained at 29c3: > http://events.ccc.de/congress/2012/Fahrplan/events/5152.en.html and > www.youtube.com/watch?v=Vdrab3sB7MU. > > E. > > > On Thu, Jan 17, 2013 at 10:46 AM, Nicholas Lucaroni < > nicholas.r.lucaroni@gmail.com> wrote: > >> https://sympa.inria.fr/sympa/arc/caml-list/2011-07/msg00117.html >> >> That thread talks about the previous and a better alternative (which is >> in 4.x). >> >> Xavier had said, >> *The SVN trunk for OCaml contains a complete reimplementation of the* >> *generic hash function that addresses both issues: lists and other >> complex keys are traversed breadth-first in a more cautious manner >> than before, and the mixing functions are based on MurmurHash 3, which >> exhibits very good statistical properties. All this should go in the >> next major release 3.13. So, everyone with an interest in efficient >> hash tables is welcome to try the trunk sources and let me know of any >> problems encountered.* >> >> On Thu, Jan 17, 2013 at 10:35 AM, Jean-Baptiste Jeannin < >> jeannin@cs.cornell.edu> wrote: >> >>> Hello, >>> >>> The default OCaml function (Hashtbl.hash) says that it "always >>> terminates, even on cyclic structures.". I would be curious to know what >>> its complexity is, both on a finite list and on a cyclic list (created by >>> let rec l = 1 :: l for example). Is the algorithm that is being used >>> published anywhere? >>> >>> Also, this hashing function seems to be returning 0 on any cyclic list >>> (at least the ones I tried). Is this normal? Does anyone know any better >>> hashing function on cyclic lists? >>> >>> # Hashtbl.hash [ 1 ; 2 ];; >>> - : int = 131199 >>> # let rec ones = 1 :: ones;; >>> val ones : int list = >>> [1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; >>> 1; 1; >>> 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; >>> 1; 1; >>> 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; >>> 1; 1; >>> 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; >>> 1; 1; >>> 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; >>> 1; 1; >>> 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; >>> 1; 1; >>> 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; >>> 1; 1; >>> 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; >>> 1; 1; >>> 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; >>> 1; 1; >>> 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; >>> 1; 1; >>> 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; >>> 1; 1; >>> 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; >>> 1; >>> ...] >>> # Hashtbl.hash ones;; >>> - : int = 0 >>> # Hashtbl.hash (5 :: 4 :: ones);; >>> - : int = 0 >>> >>> Thank you, >>> Jean-Baptiste Jeannin >>> >>> -- >>> Caml-list mailing list. Subscription management and archives: >>> https://sympa.inria.fr/sympa/**arc/caml-list >>> Beginner's list: http://groups.yahoo.com/group/**ocaml_beginners >>> Bug reports: http://caml.inria.fr/bin/caml-**bugs >>> >> >> > --001636e1ee24f095b404d37efe35 Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable You should update to OCaml 4.xx I don't think the hash function works a= t all on lists in 3.12.1 (in the sense that it, as you've found, it alw= ays returns 0).

in ocaml 4.00+rc1
# Hasht= bl.hash (6 :: 7 :: 8 :: 9 :: 10 :: 11 :: 12 :: 13:: 14 :: 15 :: []);;
- : int =3D 418135511

in ocaml 3.1= 2.1
# Hashtbl.hash (6 :: 7 :: 8 :: 9 :: 10 :: 11 :: 12 :: 13:: 14= :: 15 :: []);;
- : int =3D 0



On Thu, Jan 17, 2= 013 at 11:41 AM, Edgar Friendly <thelema314@gmail.com> wr= ote:
The source is qui= te easy to read, and is available here:

https://github.c= om/ocaml/ocaml/blob/trunk/byterun/hash.c

It turns out that even seeded MurmurHash3 does not prevent algorithmi= c attacks as expected, as demonstrated and explained at 29c3: http://events.ccc.de/congress/2012/Fahrplan/events/5152.en.html a= nd www.youtube.com/watch?v=3DVdrab3sB7MU.

E.


On Thu, Jan 17, 2013 at 10:46 AM, Nicholas Lucaroni <= nicholas= .r.lucaroni@gmail.com> wrote:
https://sympa.inria.fr/s= ympa/arc/caml-list/2011-07/msg00117.html

That thread talks about the previous and a better alternativ= e (which is in 4.x).=A0

Xavier had said,
The=A0SVN=A0trunk=A0for=A0OCaml=A0contai= ns=A0a=A0complete=A0reimplementation=A0of=A0the
generic=A0ha= sh=A0function=A0that=A0addresses=A0both=A0issues:=A0lists=A0and=A0other
complex=A0keys= =A0are=A0traversed=A0breadth-first=A0in=A0a=A0more=A0cautious=A0manner
than=A0before,= =A0and=A0the=A0mixing=A0functions=A0are=A0based=A0on=A0MurmurHash=A03,=A0wh= ich
exhibits=A0very= =A0good=A0statistical=A0properties.=A0=A0All=A0this=A0should=A0go=A0in=A0th= e
next=A0major=A0= release=A03.13.=A0=A0So,=A0everyone=A0with=A0an=A0interest=A0in=A0efficient=
hash=A0tables= =A0is=A0welcome=A0to=A0try=A0the=A0trunk=A0sources=A0and=A0let=A0me=A0know= =A0of=A0any
problems=A0enco= untered.

On Thu, Jan 17, 2013 at 10:35 AM, Jean-Bapti= ste Jeannin <jeannin@cs.cornell.edu> wrote:
Hello,

The default OCaml function (Hashtbl.hash) says that it "always termina= tes, even on cyclic structures.". I would be curious to know what its = complexity is, both on a finite list and on a cyclic list (created by let r= ec l =3D 1 :: l for example). Is the algorithm that is being used published= anywhere?

Also, this hashing function seems to be returning 0 on any cyclic list (at = least the ones I tried). Is this normal? Does anyone know any better hashin= g function on cyclic lists?

# Hashtbl.hash [ 1 ; 2 ];;
- : int =3D 131199
# let rec ones =3D 1 :: ones;;
val ones : int list =3D
=A0 [1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1= ; 1;
=A0 =A01; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;= 1; 1;
=A0 =A01; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;= 1; 1;
=A0 =A01; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;= 1; 1;
=A0 =A01; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;= 1; 1;
=A0 =A01; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;= 1; 1;
=A0 =A01; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;= 1; 1;
=A0 =A01; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;= 1; 1;
=A0 =A01; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;= 1; 1;
=A0 =A01; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;= 1; 1;
=A0 =A01; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;= 1; 1;
=A0 =A01; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;= 1;
=A0 =A0...]
# Hashtbl.hash ones;;
- : int =3D 0
# Hashtbl.hash (5 :: 4 :: ones);;
- : int =3D 0

Thank you,
Jean-Baptiste Jeannin

--
Caml-list mailing list. =A0Subscription management and archives:
ht= tps://sympa.inria.fr/sympa/arc/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners<= /a>
Bug reports:
http://caml.inria.fr/bin/caml-bugs



--001636e1ee24f095b404d37efe35--