From: Gerd Stolpmann <info@gerd-stolpmann.de>
To: Richard Jones <rich@annexia.org>
Cc: yoann padioleau <padator@wanadoo.fr>, caml-list@inria.fr
Subject: Re: [Caml-list] Memory usage/ garbage collection question
Date: Fri, 14 Oct 2005 12:07:06 +0200
Message-ID: <1129284426.12434.110.camel@localhost.localdomain>
In-Reply-To: <20051014101018.GA13302@furbychan.cocan.org>
On Friday, 14 Oct 2005 at 11:10 +0100, Richard Jones wrote:
> On Fri, Oct 14, 2005 at 11:36:57AM +0200, yoann padioleau wrote:
> > > List.iter (
> > > fun row ->
> > > (* put row into database and forget about it *)
> > > ) rows;
> > > (* no further references to rows after this *)
> >
> > Because rows is still accessible after the List.iter so it is normal
> > that it is not garbage collected.
>
> I agree that rows is "accessible", but it's not actually used. My
> understanding is that the GC would be prevented from considering the
> list for collection if the pointer to the head of the list (ie. rows)
> was stored on the heap or in a register somewhere. Would this be the
> case here?
>
> > I had the same kind of problem, and to optimize it I chose to
> > produce the elements of rows lazily (but then I had another problem
> > with the Lazy module, where elements were not garbage collected, so I
> > use my own lazy module (simply via closures) and it works perfectly
> > well).
>
> Unfortunately this isn't really an option here. The rows list comes
> from a huge XML doc which is parsed by PXP and passed through some
> complex post-processing; PXP doesn't support incremental processing of
> XML docs, and the post-processing would be tricky to convert too.
PXP has a pull parser. You get the XML document as a lazy stream of XML
events. I don't know your document format, but if it is something like
  <document>
    <record>...</record>
    <record>...</record>
    ... lots of them ...
  </document>
I would recommend using the pull parser and then creating XML trees only
for the individual records (you can mix both styles).
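
As a rough sketch of the mixed style, assuming a document shaped like the
one above: the code below pulls events one at a time and builds a small
tree only for each <record>, which is handed to a callback and then
dropped. Note that this is NOT PXP's actual API. The event and tree types
and the names read_children, iter_records and pull_of_list are
hypothetical stand-ins that use only the standard library; with PXP you
would take the events from its pull parser instead.

```ocaml
(* Hypothetical event type: a stand-in for a pull parser's events,
   not PXP's real event type. *)
type event =
  | Start_tag of string
  | End_tag of string
  | Char_data of string

(* A small tree, built for one <record> at a time. *)
type tree =
  | Element of string * tree list
  | Text of string

(* Read the children of an already opened element up to its end tag.
   [next] is the pull function: each call yields the next event, or
   None at the end of the stream. Not tail-recursive, which is fine as
   long as individual records are small. *)
let rec read_children next name =
  match next () with
  | None -> failwith ("unexpected end of stream inside <" ^ name ^ ">")
  | Some (End_tag n) when n = name -> []
  | Some (End_tag n) -> failwith ("mismatched end tag </" ^ n ^ ">")
  | Some (Char_data s) ->
      let first = Text s in
      first :: read_children next name
  | Some (Start_tag n) ->
      (* Bind the child first so events are consumed in document order. *)
      let child = Element (n, read_children next n) in
      child :: read_children next name

(* Pull events one by one; build a tree per <record>, pass it to
   [handle], and keep no reference to it afterwards. *)
let iter_records next handle =
  let rec loop () =
    match next () with
    | None -> ()
    | Some (Start_tag "record") ->
        handle (Element ("record", read_children next "record"));
        loop ()
    | Some _ -> loop ()  (* skip the <document> wrapper and stray text *)
  in
  loop ()

(* Demo source only: a pull function over an in-memory event list. *)
let pull_of_list events =
  let rest = ref events in
  fun () ->
    match !rest with
    | [] -> None
    | e :: tl -> rest := tl; Some e

let demo_events = [
  Start_tag "document";
  Start_tag "record"; Char_data "a"; End_tag "record";
  Start_tag "record";
  Start_tag "field"; Char_data "b"; End_tag "field";
  End_tag "record";
  End_tag "document";
]

let count_records events =
  let n = ref 0 in
  iter_records (pull_of_list events) (fun _ -> incr n);
  !n
```

The point of the design is that only one record's tree is referenced at
any time, so the handler can put the row into the database and forget
about it before the next record is read.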
Gerd
--
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany
gerd@gerd-stolpmann.de http://www.gerd-stolpmann.de
Telefon: 06151/153855 Telefax: 06151/997714
------------------------------------------------------------