* Memory statistics tool
@ 2008-07-23 10:54 Dr. Thomas Fischbacher
2008-07-23 11:47 ` [Caml-list] " Daniel Bünzli
2008-07-23 12:44 ` dmitry grebeniuk
0 siblings, 2 replies; 10+ messages in thread
From: Dr. Thomas Fischbacher @ 2008-07-23 10:54 UTC
To: Caml-list List
Dear OCaml folks,
when building large applications that work on complicated and highly
networked data, one problem that easily comes up is finding out which
chunks of data eat all your memory. Now, it would be marvellous for
data structure optimization purposes if there were a function
memory_footprint: 'a -> int64 (or maybe float),
which takes as argument a root
(e.g. Obj.magic [|Obj.magic firstthingy; Obj.magic secondthingy;
Obj.magic thirdthingy|])
and tells me how many cells are occupied by those ML data structures
reachable from that root. Basically, this would correspond to using
the GC's traversal mechanism and doing some internal statistics at the
same time. My guess would be that the Marshal module "almost" has such
a function already, to determine the amount of memory required to hold
a string-serialized value. But as these values get compacted, the length
of the string does not correspond to the number of words occupied by the
in-memory data.
Is there already something like that? Has anyone already built such
a tool?
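The gap between marshalled size and in-memory size can be made concrete. The following is a minimal sketch, assuming OCaml 4.04 or later, where the standard library provides Obj.reachable_words (it did not exist when this thread was written). The helper names marshalled_bytes and heap_words are made up for illustration.

```ocaml
(* Bytes needed by the Marshal encoding of [v]. *)
let marshalled_bytes (v : 'a) : int =
  String.length (Marshal.to_string v [])

(* Heap words reachable from [v], block headers included.
   Assumption: Obj.reachable_words is available (OCaml >= 4.04). *)
let heap_words (v : 'a) : int =
  Obj.reachable_words (Obj.repr v)

let () =
  (* An array of (int, string) pairs, built at run time. *)
  let data = Array.init 1000 (fun i -> (i, string_of_int i)) in
  let words = heap_words data in
  Printf.printf "marshalled: %d bytes; in memory: %d words = %d bytes\n"
    (marshalled_bytes data) words (words * (Sys.word_size / 8))
```

For data like this, the marshalled string is considerably shorter than the heap representation, because small integers and short strings are encoded compactly and block headers are not stored word by word.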
--
best regards,
Thomas Fischbacher
t.fischbacher@soton.ac.uk
* Re: [Caml-list] Memory statistics tool
From: Daniel Bünzli @ 2008-07-23 11:47 UTC
To: Caml-list List
On 23 Jul 2008, at 12:54, Dr. Thomas Fischbacher wrote:
> Is there already something like that? Has anyone already built such
> a tool?
I had the same wish the other day. I found objsize [1] but didn't use
it; instead I did a rough approximation by traversing the data
structure. A generic implementation using only the Obj module and a
lookup table to track visited nodes would be nice, but I have forgotten
too much about all the cases in the representation of Caml values to
implement it quickly and correctly.
Daniel
[1] http://caml.inria.fr/cgi-bin/hump.fr.cgi?contrib=614
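A generic traversal of the kind described above could look roughly as follows. This is only a sketch under stated assumptions, not the code of objsize or any other library: the names Seen and footprint_words are hypothetical, the visited table is keyed on physical identity, and only ordinary blocks (tuples, records, variants, arrays) are followed. Closures, lazy values, objects and custom blocks are counted but not traversed, since handling them correctly is exactly the part that needs detailed knowledge of the value representation.

```ocaml
(* Visited set keyed on physical identity of heap blocks. *)
module Seen = Hashtbl.Make (struct
  type t = Obj.t
  let hash = Hashtbl.hash   (* structural, bounded hash *)
  let equal = ( == )        (* physical equality *)
end)

(* Words (headers included) of the blocks reachable from [root].
   Assumption: blocks with tag >= Obj.lazy_tag are counted but
   their fields are not followed. *)
let footprint_words (root : 'a) : int =
  let seen = Seen.create 64 in
  let todo = Stack.create () in
  let total = ref 0 in
  Stack.push (Obj.repr root) todo;
  while not (Stack.is_empty todo) do
    let v = Stack.pop todo in
    (* Immediate values (ints, constant constructors) occupy no block. *)
    if Obj.is_block v && not (Seen.mem seen v) then begin
      Seen.replace seen v ();
      total := !total + Obj.size v + 1;
      if Obj.tag v < Obj.lazy_tag then
        for i = 0 to Obj.size v - 1 do
          Stack.push (Obj.field v i) todo
        done
    end
  done;
  !total
```

Shared substructure is counted once: for `let s = ref 0 in footprint_words (s, s)` the tuple contributes three words and the reference two. Because the hash is structural, many similar nodes can land in the same bucket, so this is not expected to be fast on large data.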
* Re: [Caml-list] Memory statistics tool
From: Jan Kybic @ 2008-07-23 12:40 UTC
To: Caml-list List
>> Is there already something like that? Has anyone already built such
>> a tool?
>
> Also had this wish the other day, I found objsize [1] but didn't use
> it -- did a rough approximation by traversing the datastructure. A
I have been using Size by Jean-Christophe Filliatre. It worked fine
for me.
http://www.lri.fr/~filliatr/ftp/ocaml/ds/
Jan
--
-------------------------------------------------------------------------
Jan Kybic <kybic@fel.cvut.cz> tel. +420 2 2435 5721
http://cmp.felk.cvut.cz/~kybic ICQ 200569450
* Re: [Caml-list] Memory statistics tool
From: dmitry grebeniuk @ 2008-07-23 12:44 UTC
To: caml-list
Hello.
DTF> memory_footprint: 'a -> int64 (or maybe float),
objsize, now hosted on OCaml forge:
http://forge.ocamlcore.org/projects/objsize/
--
WBR,
dmitry mailto:gds-mlsts@moldavcable.com
* Re: [Caml-list] Memory statistics tool
From: Dr. Thomas Fischbacher @ 2008-07-23 13:09 UTC
To: dmitry grebeniuk; +Cc: caml-list
dmitry grebeniuk wrote:
> DTF> memory_footprint: 'a -> int64 (or maybe float),
>
> objsize, now hosted on OCaml forge:
> http://forge.ocamlcore.org/projects/objsize/
Many thanks! I just had a glance at it, but it seems to be just how one
would have to approach such a problem. (The issue with hash-based
approaches to find previously visited substructures is that during
traversal, a GC may occur. Now I just assume that this may involve
relocation and heap compaction in OCaml. The problem then is that
OCaml does not properly support what would be known as eq hash tables
in Lisp.)
--
best regards,
Thomas Fischbacher
t.fischbacher@soton.ac.uk
* Re: [Caml-list] Memory statistics tool
From: Alain Frisch @ 2008-07-23 13:16 UTC
To: Dr. Thomas Fischbacher; +Cc: dmitry grebeniuk, caml-list
> Many thanks! I just had a glance at it, but it seems to be just how one
> would have to approach such a problem. (The issue with hash-based
> approaches to find previously visited substructures is that during
> traversal, a GC may occur. Now I just assume that this may involve
> relocation and heap compaction in OCaml. The problem then is that
> OCaml does not properly support what would be known as eq hash tables
> in Lisp.)
As long as the data structure supports the polymorphic hash function, it
should work to simply use a regular hash table with the polymorphic hash
function and physical equality, as in:
module S = Hashtbl.Make(struct
type t = Obj.t
let hash = Hashtbl.hash
let equal = (==)
end);;
(Of course, this might be quite slow.)
-- Alain
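To illustrate how such a table behaves as a visited set, here is a small sketch. The helper first_visit is a hypothetical name introduced for this example; the module S is the one from the message above, repeated so the snippet is self-contained.

```ocaml
module S = Hashtbl.Make (struct
  type t = Obj.t
  let hash = Hashtbl.hash
  let equal = ( == )
end)

(* Returns true the first time [v] is seen and records it;
   returns false on every later call with the same physical value. *)
let first_visit (visited : unit S.t) (v : 'a) : bool =
  let key = Obj.repr v in
  if S.mem visited key then false
  else begin
    S.replace visited key ();
    true
  end

let () =
  let visited = S.create 16 in
  let a = ref 0 and b = ref 0 in
  (* [a] and [b] are structurally equal, so they hash alike... *)
  assert (Hashtbl.hash a = Hashtbl.hash b);
  (* ...but physical equality keeps them apart as keys. *)
  assert (first_visit visited a);
  assert (not (first_visit visited a));
  assert (first_visit visited b)
```

Structurally equal values share a bucket, which is why lookups can degrade on data with many similar nodes, as the remark about speed suggests.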
* Re: [Caml-list] Memory statistics tool
From: Dr. Thomas Fischbacher @ 2008-07-24 12:48 UTC
To: Alain Frisch; +Cc: dmitry grebeniuk, caml-list
Alain Frisch wrote:
>>Many thanks! I just had a glance at it, but it seems to be just how one
>>would have to approach such a problem. (The issue with hash-based
>>approaches to find previously visited substructures is that during
>>traversal, a GC may occur. Now I just assume that this may involve
>>relocation and heap compaction in OCaml. The problem then is that
>>OCaml does not properly support what would be known as eq hash tables
>>in Lisp.)
>
>
> As long as the data structure supports the polymorphic hash function, it
> should work to simply use a regular hash table with the polymorphic hash
> function and physical equality, as in:
>
> module S = Hashtbl.Make(struct
> type t = Obj.t
> let hash = Hashtbl.hash
> let equal = (==)
> end);;
Why? (I.e. I'm not convinced yet.)
--
best regards,
Thomas Fischbacher
t.fischbacher@soton.ac.uk
* Re: [Caml-list] Memory statistics tool
From: Alain Frisch @ 2008-07-24 15:14 UTC
To: Dr. Thomas Fischbacher; +Cc: dmitry grebeniuk, caml-list
Dr. Thomas Fischbacher wrote:
> Alain Frisch wrote:
>> As long as the data structure supports the polymorphic hash function, it
>> should work to simply use a regular hash table with the polymorphic hash
>> function and physical equality, as in:
>>
>> module S = Hashtbl.Make(struct
>> type t = Obj.t
>> let hash = Hashtbl.hash
>> let equal = (==)
>> end);;
>
> Why? (I.e. I'm not convinced yet.)
The two functions (hash and equal) are invariant w.r.t. changes of
physical memory location of their arguments.
-- Alain
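The invariance claim can be checked with a small experiment. This is a sketch only: survives_collection is a hypothetical name, and it relies on the assumption that Gc.compact triggers at least a minor collection, which promotes the freshly allocated key out of the minor heap and so changes its address.

```ocaml
module S = Hashtbl.Make (struct
  type t = Obj.t
  let hash = Hashtbl.hash
  let equal = ( == )
end)

(* Insert a fresh block, force a collection that may move it,
   then check that both its hash and its table entry are intact. *)
let survives_collection () : bool =
  let table = S.create 16 in
  let key = Obj.repr (ref [1; 2; 3]) in
  let hash_before = Hashtbl.hash key in
  S.replace table key ();
  (* The key is young here, so the collection is expected to move it. *)
  Gc.compact ();
  Hashtbl.hash key = hash_before && S.mem table key

let () =
  Printf.printf "entry still found after collection: %b\n"
    (survives_collection ())
```

The lookup keeps working because the hash is computed from the contents of the value, and the collector updates every reference to a moved block, including the one stored in the table, so physical equality still holds afterwards.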
* Re: [Caml-list] Memory statistics tool
From: Dr. Thomas Fischbacher @ 2008-07-24 15:44 UTC
To: Alain Frisch; +Cc: dmitry grebeniuk, caml-list
Alain Frisch wrote:
>>>As long as the data structure supports the polymorphic hash function, it
>>>should work to simply use a regular hash table with the polymorphic hash
>>>function and physical equality, as in:
>>>
>>>module S = Hashtbl.Make(struct
>>> type t = Obj.t
>>> let hash = Hashtbl.hash
>>> let equal = (==)
>>>end);;
>>
>>Why? (I.e. I'm not convinced yet.)
>
>
> The two functions (hash and equal) are invariant w.r.t. changes of
> physical memory location of their arguments.
The OCaml manual gives no guarantee that Hashtbl.hash does not cons, so
I cannot assume this. Now, without that guarantee, there is a nasty race
condition in which the determination of the hash bucket causes objects
to move in memory. But still, we are safe, as we are just testing for
equality, and the hash bucket does not depend on the memory address,
but on the substructure of the hashed entity.
So, ok, you convinced me.
Anyway, it works now -- thanks to Dmitry's code, I can now do
things like...:
tf@alpha:~/ocaml$ nsim_i
In [1]: ocaml.memory_footprint(ocaml.make_element("E",[3],3,1))
Out[1]: (154.0, 49.0, 5.0)
In [2]:
...and use the interactive Python toplevel of our micromagnetic
simulator "nmag" to find out how much memory is used by the OCaml
data structures under the hood. Excellent. Thanks, Dmitry!
--
best regards,
Thomas Fischbacher
t.fischbacher@soton.ac.uk
* Re: [Caml-list] Memory statistics tool
From: Alain Frisch @ 2008-07-24 16:12 UTC
To: Dr. Thomas Fischbacher; +Cc: dmitry grebeniuk, caml-list
Dr. Thomas Fischbacher wrote:
> The OCaml manual gives no guarantee that Hashtbl.hash does not cons, so
> I cannot assume this.
Indeed, Hashtbl.hash can cons, but this does not contradict my point:
its result does not depend on the physical location of objects in memory
(if it did, it would be impossible to use this function at all).
> Now, without that guarantee, there is a nasty race
> condition in which the determination of the hash bucket causes objects
> to move in memory.
Yes, objects can move in memory, but what is wrong with that? Their new
hash value will remain the same.
-- Alain