Mailing list for all users of the OCaml language and system.
From: Nicolas Braud-Santoni <nicolas.braudsantoni@gmail.com>
To: caml-list@inria.fr
Subject: Re: [Caml-list] Hardening [Perl's] hash function further
Date: Tue, 19 Nov 2013 23:31:30 +0100
Message-ID: <528BE6C2.4000703@gmail.com>
In-Reply-To: <1384859953.62343.YahooMailNeo@web120403.mail.ne1.yahoo.com>


On 19/11/2013 12:19, Dario Teixeira wrote:
> Just to make sure we are all on the same page, allow me to summarise
> the main points under discussion.  I think we all agree on the following:
>
>  - Any hashtable implementation that uses a non-cryptographic hash function
>    and relies on an association list of buckets is vulnerable to attacks
>    that force the worst-case O(n) behaviour.  Though you can buy some time
>    by tweaking the hash function, you are still stuck in an arms race with
>    attackers.
Depends on what you mean by “non-cryptographic”.
But not using a PRF (a strong hash function) seems to be impractical.
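
To make the O(n) worst case concrete, here is a minimal sketch. It assumes
a deliberately degenerate hash function (the hypothetical `Colliding` module
below sends every key to 0) instead of an actual attack on `caml_hash`; the
effect on the buckets is the same as what an attacker tries to provoke.

```ocaml
(* Hypothetical worst case: every key hashes to the same value, which
   mimics what an attacker achieves by choosing colliding keys. *)
module Colliding = Hashtbl.Make (struct
  type t = int
  let equal (a : int) (b : int) = a = b
  let hash (_ : int) = 0 (* all keys collide on purpose *)
end)

(* Insert n distinct keys and report the length of the longest bucket. *)
let max_bucket_after n =
  let tbl = Colliding.create 16 in
  for i = 1 to n do
    Colliding.replace tbl i ()
  done;
  (Colliding.stats tbl).Hashtbl.max_bucket_length

let () =
  (* All bindings land in a single bucket, so each lookup is a linear scan. *)
  Printf.printf "max bucket length: %d\n" (max_bucket_after 1_000)
```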

>  - Solution approach #1: switch to a cryptographic hash function.  This
>    approach is simple and would solve the problem once and for all.
>    Unfortunately, the performance would be terrible (how much though?).
As I stated in my previous mail, some recent hash functions have good
speed (without relying on fancy superscalar instructions).
Moreover, it is unclear how time is spent inside `caml_hash`.
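
One way to replace guesses with numbers is a rough timing loop around
`Hashtbl.hash`, which is the OCaml-side entry point to `caml_hash`. This is
only a sketch: `time_hash` is a hypothetical helper, `Sys.time` measures
processor time coarsely, and a loop like this says nothing about where the
time goes inside the C code (a profiler would be needed for that).

```ocaml
(* Time [iters] calls to Hashtbl.hash on the value [v].
   Returns the elapsed processor time in seconds and an accumulator,
   so that the calls cannot be discarded as dead code. *)
let time_hash ~iters v =
  let t0 = Sys.time () in
  let acc = ref 0 in
  for _i = 1 to iters do
    acc := !acc lxor Hashtbl.hash v
  done;
  (Sys.time () -. t0, !acc)

let () =
  let iters = 1_000_000 in
  let dt_int, _ = time_hash ~iters 42 in
  let dt_str, _ = time_hash ~iters "a moderately long string key" in
  Printf.printf "int key:    %.3fs for %d hashes\n" dt_int iters;
  Printf.printf "string key: %.3fs for %d hashes\n" dt_str iters
```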

However, changing the hash function isn't so simple, as:
- Hashtbl's documentation[1] specifies that “in non-randomized mode, the
  order in which the bindings are enumerated is reproducible between [...]
  minor versions of OCaml”.
  This means the change cannot be made (for the non-randomized mode) until
  OCaml 5.
- Implementing a hash function in a form suitable for `byterun/hash.c`
  requires:
  - caring about endianness? (Though the hash function doesn't seem to be
    specified to be architecture-independent, it currently is[2].)
  - being able to feed the data to the hash function in small “chunks”
    while traversing the OCaml value.

[1] http://caml.inria.fr/pub/docs/manual-ocaml/libref/Hashtbl.html
[2] Except for 64-bit values that cannot be represented in 32 bits, of course.
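
The two requirements above (a fixed byte order, and feeding data in small
chunks during traversal) can be sketched in OCaml rather than C. Everything
here is an assumption made for illustration: the `value` type is a toy
stand-in for a real OCaml value, and the mixing step is 32-bit FNV-1a, which
is NOT a cryptographic hash or a PRF. Only the shape of the interface
matters: a small state that is updated one byte at a time.

```ocaml
(* Toy value tree standing in for an arbitrary OCaml value. *)
type value =
  | VInt of int
  | VStr of string
  | VBlock of value list

(* 32-bit FNV-1a state: a stand-in, not a strong hash function. *)
let init = 0x811c9dc5l

let feed_byte h b =
  Int32.mul (Int32.logxor h (Int32.of_int (b land 0xff))) 0x01000193l

(* Feed one chunk of bytes; chunks can be fed in any split. *)
let feed_string h s =
  let h = ref h in
  String.iter (fun c -> h := feed_byte !h (Char.code c)) s;
  !h

(* Feed an integer as 8 bytes, least significant byte first,
   regardless of the byte order or word size of the host. *)
let feed_int h n =
  let n = Int64.of_int n in
  let h = ref h in
  for i = 0 to 7 do
    let b = Int64.to_int (Int64.logand (Int64.shift_right n (8 * i)) 0xffL) in
    h := feed_byte !h b
  done;
  !h

(* Traverse the value, feeding a tag and a length before each payload
   so that differently shaped values do not produce the same byte stream. *)
let rec feed_value h = function
  | VInt n -> feed_int (feed_byte h 0) n
  | VStr s -> feed_string (feed_int (feed_byte h 1) (String.length s)) s
  | VBlock vs ->
      List.fold_left feed_value (feed_int (feed_byte h 2) (List.length vs)) vs

let hash_value v = feed_value init v
```

Because the state is threaded through explicitly, splitting a string into
two chunks gives the same result as feeding it whole, which is the property
a traversal-driven hash needs.
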
>  - Solution approach #2: keep the non-cryptographic hash function but use
>    a balanced tree for the buckets instead of an association list. [...]
>    performance is most likely better than the first approach.
Again, I'm wary of “hand waving” statements about performance in this case.
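
For reference, approach #2 can be sketched as follows, so that it can be
benchmarked instead of argued about. This is a minimal illustration under
stated assumptions, not a proposal for the standard library: the
`TreeBucketTable` functor and its `KEY` signature are hypothetical, the table
never resizes, and keys must provide a total order in addition to a hash
(which the current polymorphic Hashtbl does not require).

```ocaml
module type KEY = sig
  type t
  val compare : t -> t -> int
  val hash : t -> int
end

(* Hash table whose buckets are balanced trees (Map) instead of
   association lists: a bucket holding m colliding keys is searched
   in O(log m) comparisons rather than O(m). *)
module TreeBucketTable (K : KEY) = struct
  module M = Map.Make (K)

  type 'a t = { buckets : 'a M.t array }

  (* Fixed number of buckets; resizing is left out of this sketch. *)
  let create n = { buckets = Array.make (max 1 n) M.empty }

  let index t k = (K.hash k land max_int) mod Array.length t.buckets

  let replace t k v =
    let i = index t k in
    t.buckets.(i) <- M.add k v t.buckets.(i)

  let find_opt t k = M.find_opt k t.buckets.(index t k)
end

(* Usage with a deliberately colliding hash: lookups stay logarithmic. *)
module Worst = TreeBucketTable (struct
  type t = int
  let compare (a : int) (b : int) = compare a b
  let hash (_ : int) = 0
end)

let () =
  let t = Worst.create 16 in
  for i = 1 to 10_000 do
    Worst.replace t i (2 * i)
  done;
  match Worst.find_opt t 5_000 with
  | Some v -> Printf.printf "found %d\n" v
  | None -> print_endline "not found"
```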

>  - Since this problem presents itself on limited domains, [...] if the
>    adopted solution does indeed have a serious performance penalty, then
>    it should be opt-in.
If the performance penalty is prohibitive, this seems clear, yes.
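
Note that the standard Hashtbl already follows an opt-in model for its
randomized mode: a table is seeded randomly only when asked to be, either
per table with the `~random` argument, or globally with
`Hashtbl.randomize ()` or the `R` flag of OCAMLRUNPARAM. The sketch below
shows the per-table form; the table names and contents are made up for
illustration.

```ocaml
(* Default: deterministic hashing, reproducible enumeration order. *)
let plain : (string, int) Hashtbl.t = Hashtbl.create 16

(* Opt-in: this table is created with its own random seed. *)
let seeded : (string, int) Hashtbl.t = Hashtbl.create ~random:true 16

let () =
  List.iter
    (fun (k, v) ->
      Hashtbl.replace plain k v;
      Hashtbl.replace seeded k v)
    [ ("a", 1); ("b", 2); ("c", 3) ];
  (* Lookups behave identically; only the bucket placement, and hence
     the enumeration order, may differ between the two tables. *)
  Printf.printf "plain: %d, seeded: %d\n"
    (Hashtbl.find plain "b") (Hashtbl.find seeded "b")
```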


Best,
Nicolas

