Mailing list for all users of the OCaml language and system.
 help / color / mirror / Atom feed
From: Edgar Friendly <thelema314@gmail.com>
To: Nicholas Lucaroni <nicholas.r.lucaroni@gmail.com>
Cc: Jean-Baptiste Jeannin <jeannin@cs.cornell.edu>,
	"caml-list@inria.fr" <caml-list@inria.fr>
Subject: Re: [Caml-list] Hash function: complexity and circular structures
Date: Thu, 17 Jan 2013 11:41:58 -0500	[thread overview]
Message-ID: <CAL-jcAmC_Jb1x3AvLJvQVA-BtEynds31mGh+VsN5DfJe8LAJbQ@mail.gmail.com> (raw)
In-Reply-To: <CAB6W5F5ZKD0UT5opfQydKs2XQWeHDdb904XUdh9HS=MgO9rNZg@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 3399 bytes --]

The source is quite easy to read, and is available here:

https://github.com/ocaml/ocaml/blob/trunk/byterun/hash.c

It turns out that even seeded MurmurHash3 does not prevent algorithmic
attacks as expected, as demonstrated and explained at 29c3:
http://events.ccc.de/congress/2012/Fahrplan/events/5152.en.html and
www.youtube.com/watch?v=Vdrab3sB7MU.

E.


On Thu, Jan 17, 2013 at 10:46 AM, Nicholas Lucaroni <
nicholas.r.lucaroni@gmail.com> wrote:

> https://sympa.inria.fr/sympa/arc/caml-list/2011-07/msg00117.html
>
> That thread talks about the previous and a better alternative (which is in
> 4.x).
>
> Xavier had said,
> *The SVN trunk for OCaml contains a complete reimplementation of the*
> *generic hash function that addresses both issues: lists and other
> complex keys are traversed breadth-first in a more cautious manner
> than before, and the mixing functions are based on MurmurHash 3, which
> exhibits very good statistical properties.  All this should go in the
> next major release 3.13.  So, everyone with an interest in efficient
> hash tables is welcome to try the trunk sources and let me know of any
> problems encountered.*
>
> On Thu, Jan 17, 2013 at 10:35 AM, Jean-Baptiste Jeannin <
> jeannin@cs.cornell.edu> wrote:
>
>> Hello,
>>
>> The default OCaml function (Hashtbl.hash) says that it "always
>> terminates, even on cyclic structures.". I would be curious to know what
>> its complexity is, both on a finite list and on a cyclic list (created by
>> let rec l = 1 :: l for example). Is the algorithm that is being used
>> published anywhere?
>>
>> Also, this hashing function seems to be returning 0 on any cyclic list
>> (at least the ones I tried). Is this normal? Does anyone know any better
>> hashing function on cyclic lists?
>>
>> # Hashtbl.hash [ 1 ; 2 ];;
>> - : int = 131199
>> # let rec ones = 1 :: ones;;
>> val ones : int list =
>>   [1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>> 1; 1;
>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>> 1; 1;
>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>> 1; 1;
>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>> 1; 1;
>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>> 1; 1;
>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>> 1; 1;
>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>> 1; 1;
>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>> 1; 1;
>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>> 1; 1;
>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>> 1; 1;
>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>> 1; 1;
>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>>    ...]
>> # Hashtbl.hash ones;;
>> - : int = 0
>> # Hashtbl.hash (5 :: 4 :: ones);;
>> - : int = 0
>>
>> Thank you,
>> Jean-Baptiste Jeannin
>>
>> --
>> Caml-list mailing list.  Subscription management and archives:
>> https://sympa.inria.fr/sympa/**arc/caml-list<https://sympa.inria.fr/sympa/arc/caml-list>
>> Beginner's list: http://groups.yahoo.com/group/**ocaml_beginners<http://groups.yahoo.com/group/ocaml_beginners>
>> Bug reports: http://caml.inria.fr/bin/caml-**bugs<http://caml.inria.fr/bin/caml-bugs>
>>
>
>

[-- Attachment #2: Type: text/html, Size: 8018 bytes --]

  reply	other threads:[~2013-01-17 16:41 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-01-17 15:35 Jean-Baptiste Jeannin
2013-01-17 15:46 ` Nicholas Lucaroni
2013-01-17 16:41   ` Edgar Friendly [this message]
2013-01-17 17:05     ` Nicholas Lucaroni
2013-01-18 17:46       ` Jean-Baptiste Jeannin
2013-01-18 18:27         ` Gabriel Scherer
2013-01-18 20:11           ` Nick Lucaroni

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAL-jcAmC_Jb1x3AvLJvQVA-BtEynds31mGh+VsN5DfJe8LAJbQ@mail.gmail.com \
    --to=thelema314@gmail.com \
    --cc=caml-list@inria.fr \
    --cc=jeannin@cs.cornell.edu \
    --cc=nicholas.r.lucaroni@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox