Mailing list for all users of the OCaml language and system.
 help / color / mirror / Atom feed
From: Jon Harrop <jonathandeanharrop@googlemail.com>
To: "'Michael Ekstrand'" <michael@elehack.net>, <caml-list@yquem.inria.fr>
Subject: RE: [Caml-list] How does OCaml update references when values are	moved by the GC?
Date: Sat, 30 Oct 2010 21:40:32 +0100	[thread overview]
Message-ID: <019501cb7872$b56856b0$20390410$@com> (raw)
In-Reply-To: <4CCC5802.4080909@elehack.net>

Hi Michael,

Thanks for the info. I stumbled upon Richard's excellent web page describing
the internals of OCaml and the write barrier but it does not describe how
the pointers actually get rewritten.

Cheers,
Jon.

> -----Original Message-----
> From: caml-list-bounces@yquem.inria.fr [mailto:caml-list-
> bounces@yquem.inria.fr] On Behalf Of Michael Ekstrand
> Sent: 30 October 2010 18:38
> To: caml-list@yquem.inria.fr
> Subject: Re: [Caml-list] How does OCaml update references when values
> are moved by the GC?
> 
> On 10/30/2010 12:15 PM, Jon Harrop wrote:
> > I was hoping for a little more detail, of course. :-)
> >
> > How is the mapping from old to new pointers stored? Does the GC
> rewrite all
> > of the thread-local stacks in series before allowing any of them to
> > continue?
> 
> I imagine so.
> 
> > Does the write barrier record pointers written into the major heap
> > so only those specific locations are rewritten at minor heap
> collections but
> > the entire major heap is rewritten upon compaction?
> 
> Yes.  The runtime maintains a 'remembered set', a list of pointers from
> the major heap back to the minor heap.  Maintaining this set is why
> mutable data can be expensive in OCaml - any time you store a pointer
> into a mutable field, the runtime must check whether the new link is
> from the major to the minor heap and update the refs list accordingly.
> Richard WM Jones has details here:
> 
> http://rwmj.wordpress.com/2009/08/08/ocaml-internals-part-5-garbage-
> collection/
> 
> > Can the GC distinguish
> > between an array of ints and an array of pointers at run-time in
> order to
> > avoid traversing all of the ints when trying to rewrite pointers?
> 
> Not that I know of.  The tag block does not have a documented reserved
> value to indicate that - there are values to indicate an unboxed float
> array, a string, and an array of opaque values, but not an integer
> array
> (unless the opaque value flag is set for integer arrays).
> 
> > Also, any idea what the maximum proportion of the running time of a
> program
> > is spent doing this rewriting? For example, if you fill a float-
> >float hash
> > table with values does updating the pointers in the major heap
> account for a
> > significant proportion of the total running time of the program?
> 
> In my data analysis jobs (which wind up allocating quite large heaps),
> the compactor almost never (detectably) runs.  Minor cycles and major
> slices are a much larger concern in my experience.  I work around that
> by increasing the minor heap size to decrease minor heap thrashing (my
> general rule of thumb is that I want each "work unit", whatever that
> may
> be, to fit in the minor heap).
> 
> It could well be that other applications will have different
> characteristics that trigger more compactions.  I cannot speak for
> those
> applications.  Further, when I have huge floating-point data
> structures,
> I'm usually using bigarrays (not because I choose them over arrays,
> typically, but because such code in my work frequently has to interact
> with BLAS or SPARSKIT at some point).
> 
> - Michael
> 
> _______________________________________________
> Caml-list mailing list. Subscription management:
> http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
> Archives: http://caml.inria.fr
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs


  reply	other threads:[~2010-10-30 20:40 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <418632253.26199.1288302511712.JavaMail.root@zmbs1.inria.fr>
2010-10-29  7:47 ` Damien Doligez
2010-10-30 11:14   ` Elias Gabriel Amaral da Silva
2010-10-30 17:15   ` Jon Harrop
2010-10-30 17:38     ` Michael Ekstrand
2010-10-30 20:40       ` Jon Harrop [this message]
2010-11-03  9:49         ` Richard Jones
2010-10-30 20:40   ` Jon Harrop
     [not found]   ` <1033013046.78315.1288458970997.JavaMail.root@zmbs4.inria.fr>
2010-11-02 10:25     ` Xavier Leroy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='019501cb7872$b56856b0$20390410$@com' \
    --to=jonathandeanharrop@googlemail.com \
    --cc=caml-list@yquem.inria.fr \
    --cc=michael@elehack.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox