* Memory statistics tool
From: Dr. Thomas Fischbacher @ 2008-07-23 10:54 UTC
To: Caml-list List

Dear OCaml folks,

when building large applications that work on complicated and highly
networked data, one issue that easily comes up is getting some idea of
which chunks of data eat all your memory.

Now, it would be marvellous for data structure optimization purposes if
there were a function

  memory_footprint: 'a -> int64 (or maybe float),

which takes as argument a root (e.g. Obj.magic [|Obj.magic firstthingy;
Obj.magic secondthingy; Obj.magic thirdthingy|]) and tells me how many
cells are occupied by those ML data structures reachable from that root.

Basically, this would correspond to using the GC's traversal mechanism
and doing some internal statistics at the same time.

My guess would be that the Marshal module "almost" has such a function
already, to determine the amount of memory required to hold a
string-serialized value. But as these values get compacted, the length
of the string does not correspond to the number of words occupied by the
in-memory data.

Is there already something like that? Has anyone already built such
a tool?

-- 
best regards,
Thomas Fischbacher
t.fischbacher@soton.ac.uk
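[Editorial illustration, not part of the original message.] The gap
between the marshaled length and the in-memory size can be sketched as
follows. This assumes a much newer compiler than the 2008 thread had:
Obj.reachable_words has been in the standard library since OCaml 4.04.
The names marshaled_bytes and in_memory_bytes are hypothetical helpers
introduced only for this comparison.

```ocaml
(* Bytes in the serialized form of [v]; Marshal uses a compact encoding. *)
let marshaled_bytes (v : 'a) : int =
  String.length (Marshal.to_string v [])

(* Bytes occupied in memory by the heap blocks reachable from [v],
   headers included. Obj.reachable_words exists since OCaml 4.04. *)
let in_memory_bytes (v : 'a) : int =
  Obj.reachable_words (Obj.repr v) * (Sys.word_size / 8)

let () =
  (* Built at run time so that the list is an ordinary heap value. *)
  let l = List.init 100 (fun i -> i) in
  Printf.printf "marshaled: %d bytes, in memory: %d bytes\n"
    (marshaled_bytes l) (in_memory_bytes l)
```

Each cons cell takes three words in memory (header plus two fields),
whereas the serialized form spends only a byte or two per small integer,
so the string length is not a usable stand-in for the heap footprint.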
* Re: [Caml-list] Memory statistics tool
From: Daniel Bünzli @ 2008-07-23 11:47 UTC
To: Caml-list List

On 23 Jul 2008, at 12:54, Dr. Thomas Fischbacher wrote:

> Is there already something like that? Has anyone already built such
> a tool?

I also had this wish the other day. I found objsize [1] but didn't use
it; instead I did a rough approximation by traversing the data
structure. A generic implementation using only the Obj module and a
lookup table to track visited nodes would be nice, but I have forgotten
too much about all the cases in the representation of Caml values to
implement it quickly and correctly.

Daniel

[1] http://caml.inria.fr/cgi-bin/hump.fr.cgi?contrib=614
* Re: [Caml-list] Memory statistics tool
From: Jan Kybic @ 2008-07-23 12:40 UTC
To: Caml-list List

>> Is there already something like that? Has anyone already built such
>> a tool?
>
> Also had this wish the other day, I found objsize [1] but didn't use
> it -- did a rough approximation by traversing the datastructure.

I have been using Size by Jean-Christophe Filliatre. It worked fine for
me.

http://www.lri.fr/~filliatr/ftp/ocaml/ds/

Jan

-- 
-------------------------------------------------------------------------
Jan Kybic <kybic@fel.cvut.cz>                       tel. +420 2 2435 5721
http://cmp.felk.cvut.cz/~kybic                      ICQ 200569450
* Re: [Caml-list] Memory statistics tool
From: dmitry grebeniuk @ 2008-07-23 12:44 UTC
To: caml-list

Hello.

DTF> memory_footprint: 'a -> int64 (or maybe float),

objsize, now hosted on OCaml forge:
http://forge.ocamlcore.org/projects/objsize/

-- 
WBR,
dmitry
mailto:gds-mlsts@moldavcable.com
* Re: [Caml-list] Memory statistics tool
From: Dr. Thomas Fischbacher @ 2008-07-23 13:09 UTC
To: dmitry grebeniuk; +Cc: caml-list

dmitry grebeniuk wrote:

> DTF> memory_footprint: 'a -> int64 (or maybe float),
>
> objsize, now hosted on OCaml forge:
> http://forge.ocamlcore.org/projects/objsize/

Many thanks! I just had a glance at it, but it seems to be just how one
would have to approach such a problem. (The issue with hash-based
approaches to find previously visited substructures is that during
traversal, a GC may occur. Now I just assume that this may involve
relocation and heap compaction in OCaml. The problem then is that
OCaml does not properly support what would be known as eq hash tables
in Lisp.)

-- 
best regards,
Thomas Fischbacher
t.fischbacher@soton.ac.uk
* Re: [Caml-list] Memory statistics tool
From: Alain Frisch @ 2008-07-23 13:16 UTC
To: Dr. Thomas Fischbacher; +Cc: dmitry grebeniuk, caml-list

> Many thanks! I just had a glance at it, but it seems to be just how one
> would have to approach such a problem. (The issue with hash-based
> approaches to find previously visited substructures is that during
> traversal, a GC may occur. Now I just assume that this may involve
> relocation and heap compaction in OCaml. The problem then is that
> OCaml does not properly support what would be known as eq hash tables
> in Lisp.)

As long as the data structure supports the polymorphic hash function, it
should work to simply use a regular hash table with the polymorphic hash
function and physical equality, as in:

  module S = Hashtbl.Make(struct
    type t = Obj.t
    let hash = Hashtbl.hash
    let equal = (==)
  end);;

(Of course, this might be quite slow.)

-- Alain
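[Editorial illustration, not part of the original message.] A minimal
sketch of how such a table could be combined with the Obj-based
traversal wished for earlier in the thread. It is not the objsize or
Size implementation. The names Seen and footprint_words are
hypothetical, and the sketch assumes it is acceptable to count, but not
descend into, blocks with special tags (closures, lazy values, objects
and the like), so it under-reports anything reachable only through them.

```ocaml
(* Visited-set keyed on physical identity, following the module above. *)
module Seen = Hashtbl.Make (struct
  type t = Obj.t
  let hash = Hashtbl.hash   (* structural hash: unaffected by GC moves *)
  let equal = ( == )        (* physical equality: same heap block *)
end)

(* Hypothetical helper: words (headers included) of the heap blocks
   reachable from [root], each shared block counted once. *)
let footprint_words (root : Obj.t) : int =
  let seen = Seen.create 1024 in
  let todo = Stack.create () in   (* explicit stack: no deep recursion *)
  let total = ref 0 in
  Stack.push root todo;
  while not (Stack.is_empty todo) do
    let v = Stack.pop todo in
    if Obj.is_block v && not (Seen.mem seen v) then begin
      Seen.add seen v ();
      total := !total + Obj.size v + 1;   (* fields plus header word *)
      (* Assumption: only ordinary structured blocks are descended into;
         strings, floats, custom blocks, closures etc. are leaves here. *)
      if Obj.tag v < Obj.lazy_tag then
        for i = 0 to Obj.size v - 1 do
          Stack.push (Obj.field v i) todo
        done
    end
  done;
  !total
```

A call such as footprint_words (Obj.repr (l, l)) counts the list l only
once, which is the point of the visited table. As noted in the message,
the polymorphic hash inspects only a bounded prefix of each value, so
large structures whose blocks look alike may collide heavily and make
this slow.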
* Re: [Caml-list] Memory statistics tool
From: Dr. Thomas Fischbacher @ 2008-07-24 12:48 UTC
To: Alain Frisch; +Cc: dmitry grebeniuk, caml-list

Alain Frisch wrote:

>> Many thanks! I just had a glance at it, but it seems to be just how one
>> would have to approach such a problem. (The issue with hash-based
>> approaches to find previously visited substructures is that during
>> traversal, a GC may occur. Now I just assume that this may involve
>> relocation and heap compaction in OCaml. The problem then is that
>> OCaml does not properly support what would be known as eq hash tables
>> in Lisp.)
>
> As long as the data structure supports the polymorphic hash function, it
> should work to simply use a regular hash table with the polymorphic hash
> function and physical equality, as in:
>
> module S = Hashtbl.Make(struct
>   type t = Obj.t
>   let hash = Hashtbl.hash
>   let equal = (==)
> end);;

Why? (I.e. I'm not convinced yet.)

-- 
best regards,
Thomas Fischbacher
t.fischbacher@soton.ac.uk
* Re: [Caml-list] Memory statistics tool
From: Alain Frisch @ 2008-07-24 15:14 UTC
To: Dr. Thomas Fischbacher; +Cc: dmitry grebeniuk, caml-list

Dr. Thomas Fischbacher wrote:

> Alain Frisch wrote:
>> As long as the data structure supports the polymorphic hash function, it
>> should work to simply use a regular hash table with the polymorphic hash
>> function and physical equality, as in:
>>
>> module S = Hashtbl.Make(struct
>>   type t = Obj.t
>>   let hash = Hashtbl.hash
>>   let equal = (==)
>> end);;
>
> Why? (I.e. I'm not convinced yet.)

The two functions (hash and equal) are invariant w.r.t. changes of
physical memory location of their arguments.

-- Alain
* Re: [Caml-list] Memory statistics tool
From: Dr. Thomas Fischbacher @ 2008-07-24 15:44 UTC
To: Alain Frisch; +Cc: dmitry grebeniuk, caml-list

Alain Frisch wrote:

>>> As long as the data structure supports the polymorphic hash function, it
>>> should work to simply use a regular hash table with the polymorphic hash
>>> function and physical equality, as in:
>>>
>>> module S = Hashtbl.Make(struct
>>>   type t = Obj.t
>>>   let hash = Hashtbl.hash
>>>   let equal = (==)
>>> end);;
>>
>> Why? (I.e. I'm not convinced yet.)
>
> The two functions (hash and equal) are invariant w.r.t. changes of
> physical memory location of their arguments.

The OCaml manual gives no guarantee that Hashtbl.hash does not cons, so
I cannot assume this. Now, without that guarantee, there is a nasty race
condition in which the determination of the hash bucket causes objects
to move in memory. But still, we are safe, as we are just testing for
equality, and the hash bucket does not depend on the memory address, but
on the substructure of the hashed entity.

So, ok, you convinced me.

Anyway, it works now -- thanks to Dmitry's code, I can now do things
like...:

  tf@alpha:~/ocaml$ nsim_i
  In [1]: ocaml.memory_footprint(ocaml.make_element("E",[3],3,1))
  Out[1]: (154.0, 49.0, 5.0)

  In [2]:

...and use the interactive Python toplevel of our micromagnetic
simulator "nmag" to find out how much memory is used by the OCaml data
structures under the hood. Excellent. Thanks, Dmitry!

-- 
best regards,
Thomas Fischbacher
t.fischbacher@soton.ac.uk
* Re: [Caml-list] Memory statistics tool
From: Alain Frisch @ 2008-07-24 16:12 UTC
To: Dr. Thomas Fischbacher; +Cc: dmitry grebeniuk, caml-list

Dr. Thomas Fischbacher wrote:

> The OCaml manual gives no guarantee that Hashtbl.hash does not cons, so
> I cannot assume this.

Indeed, Hashtbl.hash can cons, but this does not contradict my point:
its result does not depend on the physical location of objects in memory
(if it did, it would be impossible to use this function at all).

> Now, without that guarantee, there is a nasty race
> condition in which the determination of the hash bucket causes objects
> to move in memory.

Yes, objects can move in memory, but what is wrong with that? Their new
hash value will remain the same.

-- Alain
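[Editorial illustration, not part of the original message.] The
invariance argument can be checked with a small experiment, sketched
here under the assumption that forcing a collection with Gc.compact
gives the runtime an opportunity to relocate a freshly allocated value.
The names Seen and survives_gc are hypothetical.

```ocaml
module Seen = Hashtbl.Make (struct
  type t = Obj.t
  let hash = Hashtbl.hash   (* depends on contents, not on the address *)
  let equal = ( == )        (* the GC updates both references on a move *)
end)

(* Hypothetical check: insert [v], let the GC run (and possibly move
   the block), then verify that the hash is unchanged and that the
   key is still found under physical equality. *)
let survives_gc (v : 'a) : bool =
  let tbl = Seen.create 16 in
  let key = Obj.repr v in
  let h = Hashtbl.hash key in
  Seen.add tbl key ();
  Gc.compact ();
  Hashtbl.hash key = h && Seen.mem tbl key
```

For a value allocated just before the call, for instance
survives_gc (List.init 10 string_of_int), the collection promotes the
blocks out of the minor heap, yet the lookup still succeeds: the bucket
is derived from the structure of the value, and the table's copy of the
pointer is rewritten by the GC along with every other reference.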
Thread overview: 10+ messages (end of thread, newest: 2008-07-24 16:12 UTC)

2008-07-23 10:54 Memory statistics tool Dr. Thomas Fischbacher
2008-07-23 11:47 ` [Caml-list] " Daniel Bünzli
2008-07-23 12:40   ` Jan Kybic
2008-07-23 12:44 ` dmitry grebeniuk
2008-07-23 13:09   ` Dr. Thomas Fischbacher
2008-07-23 13:16     ` Alain Frisch
2008-07-24 12:48       ` Dr. Thomas Fischbacher
2008-07-24 15:14         ` Alain Frisch
2008-07-24 15:44           ` Dr. Thomas Fischbacher
2008-07-24 16:12             ` Alain Frisch