* Re: ancient module
From: Richard Jones @ 2010-09-14 20:46 UTC
To: Yoann Padioleau; +Cc: caml-list

On Tue, Sep 14, 2010 at 08:19:49PM +0000, Yoann Padioleau wrote:
> Hi,
>
> I am trying to use your Ancient module to avoid having the garbage
> collector spend lots of time iterating over huge data in memory. It
> works quite well for arrays but for hashtbl I have some problems
> where I am not able to find back keys that were clearly in the
> original hashtbl (before Ancient.mark it).
>
> In the doc it says:
>
> (1) Ad-hoc polymorphic primitives (structural equality, marshalling
> and hashing) do not work on ancient data structures, meaning that you
> will need to provide your own comparison and hashing functions.

The issue is described by Xavier Leroy:
http://caml.inria.fr/pub/ml-archives/caml-list/2006/09/977818689f4ceb2178c592453df7a343.en.html

As far as my understanding goes, what happens is that the OCaml
compare function (or some C equivalent in the runtime) looks at the
two string pointers and decides that since both are out of the normal
heap they are just opaque objects. Thus it won't compare the content
of the strings, but will just do pointer equality. This massively
breaks assumptions in some ordinary OCaml code, in this instance in
Hashtbl.

> which means I have to transform my code using Hashtbl.xxx into one
> using the functorized version of hashtbl ? I have hashtbl of strings
> to complex data type. What would be a good hash function for
> strings ?

It may be that Map also has the same problems. You wouldn't really
know except by examining the code.

Later you wrote:
> Actually it seems I have the problem only with Hashtbl from strings
> to whatever. I also have some Hashtbl from int to whatever and they
> work fine after the Ancient.mark.

ints aren't compared in the same way. They are always compared using
pointer equality, so there's no issue.

I've only used ancient to store simple arrays, and when we needed to
do string equality I remember writing a function which was aware of
the above issue (you can compare them byte for byte just fine, even
from OCaml code).

Rich.

-- 
Richard Jones
Red Hat

* Re: [Caml-list] Re: ancient module
From: Richard Jones @ 2010-09-14 20:48 UTC
To: Yoann Padioleau; +Cc: caml-list

On Tue, Sep 14, 2010 at 09:46:24PM +0100, Richard Jones wrote:
> I've only used ancient to store simple arrays, and when we needed to
> do string equality I remember writing a function which was aware of
> the above issue (you can compare them byte for byte just fine, even
> from OCaml code).

Answering my own question, I guess you can use Map, but write a custom
string comparison function. It ought to work, but I haven't tested it :-)

Rich.

-- 
Richard Jones
Red Hat
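
A minimal sketch of the Map suggestion above, assuming string keys. The names `compare_bytewise` and `BytewiseMap` are made up for illustration and come from neither the thread nor the Ancient library. The comparison walks the bytes in plain OCaml and never calls the polymorphic `compare`. It has only been exercised on ordinary heap strings here, not on data passed through Ancient.mark.

```ocaml
(* Hypothetical helper: compare two strings byte by byte without
   going through the polymorphic [compare] primitive. *)
let compare_bytewise (a : string) (b : string) : int =
  let la = String.length a and lb = String.length b in
  let n = if la < lb then la else lb in
  let rec loop i =
    if i >= n then la - lb (* common prefix equal: shorter string sorts first *)
    else
      let d = Char.code a.[i] - Char.code b.[i] in
      if d <> 0 then d else loop (i + 1)
  in
  loop 0

(* Map keyed by strings, ordered by the byte-wise comparison. *)
module BytewiseMap = Map.Make (struct
  type t = string
  let compare = compare_bytewise
end)

let () =
  let m = BytewiseMap.(empty |> add "foo" 1 |> add "bar" 2) in
  assert (BytewiseMap.find "foo" m = 1);
  assert (not (BytewiseMap.mem "baz" m))
```

Only the key comparison is replaced; anything else in the program that applies polymorphic equality or hashing to ancient values would still need the same treatment.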

* Re: [Caml-list] Re: ancient module
From: Erkki Seppala @ 2010-09-15 7:41 UTC
To: caml-list

Richard Jones <rich@annexia.org> writes:
> On Tue, Sep 14, 2010 at 09:46:24PM +0100, Richard Jones wrote:
> Answering my own question, I guess you can use Map, but write a custom
> string comparison function. Ought to work but not tested it :-)

And in similar fashion, one could use Hashtbl.Make to construct a
custom hash table with Hashtbl.HashedType, but provide a custom hashing
(and comparison) function. I assume the default hashing function also
stops upon finding data that's outside the OCaml heap.

Also, the compiler recognizes when strings are compared and calls the
comparing function directly. So

  let cmp (a : string) b = a < b

produces a call directly to caml_string_lessthan, which I assume would
not make any special checks.

-- 
http://www.modeemi.fi/~flux/
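
The Hashtbl.Make approach described above could look roughly like this. `BytewiseString` and `StringTbl` are illustrative names, and the multiply-by-31 hash is just a simple stand-in, not a recommendation from the thread. Both functions read the string bytes directly instead of calling Hashtbl.hash or polymorphic equality. This is a sketch checked only against ordinary heap strings, not against ancient data.

```ocaml
(* Hypothetical HashedType for string keys that avoids the
   polymorphic hashing and equality primitives. *)
module BytewiseString = struct
  type t = string

  (* Equal lengths and equal bytes at every index. *)
  let equal (a : string) (b : string) : bool =
    let n = String.length a in
    let rec loop i =
      i >= n || (Char.code a.[i] = Char.code b.[i] && loop (i + 1))
    in
    n = String.length b && loop 0

  (* Simple multiplicative hash over the bytes (an assumed stand-in);
     the final mask keeps the result non-negative. *)
  let hash (s : string) : int =
    let h = ref 0 in
    for i = 0 to String.length s - 1 do
      h := (!h * 31) + Char.code s.[i]
    done;
    !h land max_int
end

module StringTbl = Hashtbl.Make (BytewiseString)

let () =
  let t = StringTbl.create 16 in
  StringTbl.replace t "key" 42;
  assert (StringTbl.find t "key" = 42);
  assert (not (StringTbl.mem t "other"))
```

Existing code that calls Hashtbl.find or Hashtbl.add on such a table would need to be switched to the functor instance (StringTbl.find, StringTbl.add, and so on), which is the transformation asked about at the start of the thread.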

* Re: [Caml-list] Re: ancient module
From: Gerd Stolpmann @ 2010-09-20 18:52 UTC
To: Richard Jones; +Cc: Yoann Padioleau, caml-list

On Tuesday, 14.09.2010, at 21:46 +0100, Richard Jones wrote:
> On Tue, Sep 14, 2010 at 08:19:49PM +0000, Yoann Padioleau wrote:
> > Hi,
> >
> > I am trying to use your Ancient module to avoid having the garbage
> > collector spend lots of time iterating over huge data in memory. It
> > works quite well for arrays but for hashtbl I have some problems
> > where I am not able to find back keys that were clearly in the
> > original hashtbl (before Ancient.mark it).
> >
> > In the doc it says:
> >
> > (1) Ad-hoc polymorphic primitives (structural equality, marshalling
> > and hashing) do not work on ancient data structures, meaning that you
> > will need to provide your own comparison and hashing functions.
>
> The issue is described by Xavier Leroy:
> http://caml.inria.fr/pub/ml-archives/caml-list/2006/09/977818689f4ceb2178c592453df7a343.en.html
>
> As far as my understanding goes, what happens is that the OCaml
> compare function (or some C equivalent in the runtime) looks at the
> two string pointers and decides that since both are out of the normal
> heap they are just opaque objects. Thus it won't compare the content
> of the strings, but will just do pointer equality. This massively
> breaks assumptions in some ordinary OCaml code, in this instance in
> Hashtbl.

There is now a way to change this. You can call caml_page_table_add
(since 3.11) to explicitly declare a memory region as containing OCaml
values. Polymorphic comparison, the hash primitive, and marshalling
then work on that region.

There is support for this in Ocamlnet-3:
http://projects.camlcity.org/projects/dl/ocamlnet-3.0.3/doc/html-main/Netsys_mem.html#VALvalue_area

Gerd

> > which means I have to transform my code using Hashtbl.xxx into one
> > using the functorized version of hashtbl ? I have hashtbl of strings
> > to complex data type. What would be a good hash function for
> > strings ?
>
> It may be that Map also has the same problems. You wouldn't really
> know except by examining the code.
>
> Later you wrote:
> > Actually it seems I have the problem only with Hashtbl from strings
> > to whatever. I also have some Hashtbl from int to whatever and they
> > work fine after the Ancient.mark.
>
> ints aren't compared in the same way. They are always compared using
> pointer equality, so there's no issue.
>
> I've only used ancient to store simple arrays, and when we needed to
> do string equality I remember writing a function which was aware of
> the above issue (you can compare them byte for byte just fine, even
> from OCaml code).
>
> Rich.
>

-- 
------------------------------------------------------------
Gerd Stolpmann, Bad Nauheimer Str.3, 64289 Darmstadt,Germany
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
Phone: +49-6151-153855                  Fax: +49-6151-997714
------------------------------------------------------------