Mailing list for all users of the OCaml language and system.
 help / color / mirror / Atom feed
From: Nicholas Lucaroni <nicholas.r.lucaroni@gmail.com>
To: Edgar Friendly <thelema314@gmail.com>
Cc: Jean-Baptiste Jeannin <jeannin@cs.cornell.edu>,
	"caml-list@inria.fr" <caml-list@inria.fr>
Subject: Re: [Caml-list] Hash function: complexity and circular structures
Date: Thu, 17 Jan 2013 12:05:03 -0500	[thread overview]
Message-ID: <CAB6W5F4-jfWgem1+f+khsbzD-b6ZG4+F8X7VYDjQM7ax8gTimw@mail.gmail.com> (raw)
In-Reply-To: <CAL-jcAmC_Jb1x3AvLJvQVA-BtEynds31mGh+VsN5DfJe8LAJbQ@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 3965 bytes --]

You should update to OCaml 4.xx I don't think the hash function works at
all on lists in 3.12.1 (in the sense that it, as you've found, it always
returns 0).

in ocaml 4.00+rc1
# Hashtbl.hash (6 :: 7 :: 8 :: 9 :: 10 :: 11 :: 12 :: 13:: 14 :: 15 :: []);;
- : int = 418135511

in ocaml 3.12.1
# Hashtbl.hash (6 :: 7 :: 8 :: 9 :: 10 :: 11 :: 12 :: 13:: 14 :: 15 :: []);;
- : int = 0



On Thu, Jan 17, 2013 at 11:41 AM, Edgar Friendly <thelema314@gmail.com>wrote:

> The source is quite easy to read, and is available here:
>
> https://github.com/ocaml/ocaml/blob/trunk/byterun/hash.c
>
> It turns out that even seeded MurmurHash3 does not prevent algorithmic
> attacks as expected, as demonstrated and explained at 29c3:
> http://events.ccc.de/congress/2012/Fahrplan/events/5152.en.html and
> www.youtube.com/watch?v=Vdrab3sB7MU.
>
> E.
>
>
> On Thu, Jan 17, 2013 at 10:46 AM, Nicholas Lucaroni <
> nicholas.r.lucaroni@gmail.com> wrote:
>
>> https://sympa.inria.fr/sympa/arc/caml-list/2011-07/msg00117.html
>>
>> That thread talks about the previous and a better alternative (which is
>> in 4.x).
>>
>> Xavier had said,
>> *The SVN trunk for OCaml contains a complete reimplementation of the*
>> *generic hash function that addresses both issues: lists and other
>> complex keys are traversed breadth-first in a more cautious manner
>> than before, and the mixing functions are based on MurmurHash 3, which
>> exhibits very good statistical properties.  All this should go in the
>> next major release 3.13.  So, everyone with an interest in efficient
>> hash tables is welcome to try the trunk sources and let me know of any
>> problems encountered.*
>>
>> On Thu, Jan 17, 2013 at 10:35 AM, Jean-Baptiste Jeannin <
>> jeannin@cs.cornell.edu> wrote:
>>
>>> Hello,
>>>
>>> The default OCaml function (Hashtbl.hash) says that it "always
>>> terminates, even on cyclic structures.". I would be curious to know what
>>> its complexity is, both on a finite list and on a cyclic list (created by
>>> let rec l = 1 :: l for example). Is the algorithm that is being used
>>> published anywhere?
>>>
>>> Also, this hashing function seems to be returning 0 on any cyclic list
>>> (at least the ones I tried). Is this normal? Does anyone know any better
>>> hashing function on cyclic lists?
>>>
>>> # Hashtbl.hash [ 1 ; 2 ];;
>>> - : int = 131199
>>> # let rec ones = 1 :: ones;;
>>> val ones : int list =
>>>   [1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>>> 1; 1;
>>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>>> 1; 1;
>>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>>> 1; 1;
>>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>>> 1; 1;
>>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>>> 1; 1;
>>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>>> 1; 1;
>>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>>> 1; 1;
>>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>>> 1; 1;
>>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>>> 1; 1;
>>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>>> 1; 1;
>>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>>> 1; 1;
>>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>>> 1;
>>>    ...]
>>> # Hashtbl.hash ones;;
>>> - : int = 0
>>> # Hashtbl.hash (5 :: 4 :: ones);;
>>> - : int = 0
>>>
>>> Thank you,
>>> Jean-Baptiste Jeannin
>>>
>>> --
>>> Caml-list mailing list.  Subscription management and archives:
>>> https://sympa.inria.fr/sympa/**arc/caml-list<https://sympa.inria.fr/sympa/arc/caml-list>
>>> Beginner's list: http://groups.yahoo.com/group/**ocaml_beginners<http://groups.yahoo.com/group/ocaml_beginners>
>>> Bug reports: http://caml.inria.fr/bin/caml-**bugs<http://caml.inria.fr/bin/caml-bugs>
>>>
>>
>>
>

[-- Attachment #2: Type: text/html, Size: 8961 bytes --]

  reply	other threads:[~2013-01-17 17:05 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-01-17 15:35 Jean-Baptiste Jeannin
2013-01-17 15:46 ` Nicholas Lucaroni
2013-01-17 16:41   ` Edgar Friendly
2013-01-17 17:05     ` Nicholas Lucaroni [this message]
2013-01-18 17:46       ` Jean-Baptiste Jeannin
2013-01-18 18:27         ` Gabriel Scherer
2013-01-18 20:11           ` Nick Lucaroni

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAB6W5F4-jfWgem1+f+khsbzD-b6ZG4+F8X7VYDjQM7ax8gTimw@mail.gmail.com \
    --to=nicholas.r.lucaroni@gmail.com \
    --cc=caml-list@inria.fr \
    --cc=jeannin@cs.cornell.edu \
    --cc=thelema314@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox