From: Gerd Stolpmann <info@gerd-stolpmann.de>
To: "Alexander V. Voinov" <avv@quasar.ipa.nw.ru>
Cc: "caml-list@inria.fr" <caml-list@inria.fr>
Subject: Re: [Caml-list] ocaml-3.05: a performance experience
Date: Sat, 3 Aug 2002 14:33:11 +0200
Message-ID: <20020803123311.GA631@ice.gerd-stolpmann.de>
In-Reply-To: <3D49FD72.68388864@quasar.ipa.nw.ru>; from avv@quasar.ipa.nw.ru on Fri, Aug 02, 2002 at 05:33:06 +0200
On 2002.08.02 05:33 Alexander V. Voinov wrote:
> Hi All,
>
> I have an application, which parses a huge XML file and stores resulting
> records to a database.
>
> The file is parsed using PXP, but in a 'pulldom' manner, by extracting
> (to a Buffer) first level tags manually with pcre, then an array insert
> of 30000 recognized and accumulated records is performed. DB access
> takes a small fraction of the run time.
>
> Compiled with ocaml-3.04 it took 1h40m+-5m of 'user' process time and
> occupied about 340M in RAM. With 3.05 it took 2h40m+-5m and occupied
> 250M.
>
> Is this the consequence of the new GC strategy? Actually I'd tolerate
> large footprint for the sake of more speed.
>
> It's also interesting to note that in the case of 3.04 the footprint of
> the application starts from 330M and slowly expands to 350M. With 3.05
> it starts with 250M and then almost does not expand till the end.
>
> Sparc Solaris 2.7, gcc 3.0.4.
>
> A previous version of this app, written in Python with PyXML, runs 3-4
> times slower than the 3.04 version and takes 20M in RAM.
I think you are observing GC compaction. You can turn it off by setting
OCAMLRUNPARAM="O=1000000" (or by calling Gc.set with max_overhead = 1000000).
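For reference, here is a minimal sketch of the Gc.set variant, assuming the
standard library's Gc module, where the max_overhead field of Gc.control is
the in-program counterpart of the "O" runtime parameter. The function name
disable_compaction is only illustrative. This targets the runtime discussed
in this thread; later runtime versions may treat the setting differently.

```ocaml
(* Illustrative helper (name is not from the original message):
   disable heap compaction from inside the program, as an alternative to
   setting OCAMLRUNPARAM="O=1000000" in the environment. *)
let disable_compaction () =
  let params = Gc.get () in
  (* A max_overhead of 1000000 or more asks the runtime never to trigger
     compaction automatically. All other GC parameters are left as is. *)
  Gc.set { params with Gc.max_overhead = 1_000_000 }

let () =
  disable_compaction ();
  (* Print the value the runtime now reports, as a quick sanity check. *)
  Printf.printf "max_overhead = %d\n" (Gc.get ()).Gc.max_overhead
```

Calling this once at program start, before the large XML file is read, should
have the same effect as the environment variable without requiring a wrapper
script.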
If XML validation is not needed, you could also rewrite your program
to use the new event-based parsing in PXP-1.1.90. That would completely
avoid representing the XML tree in memory (and increase the speed, because
GC of large memory footprints is expensive).
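To illustrate why event-based processing keeps the footprint small, here is
a rough sketch that does not use PXP's actual API. The event type, the
"record" element name, and the functions process_events and source_of_list
are all hypothetical stand-ins; a real program would take its events from
the PXP event parser instead. The point is only that the consumer holds one
record at a time rather than a whole tree.

```ocaml
(* Hypothetical, simplified event type; a real parser reports more cases. *)
type event =
  | Start_tag of string
  | Text of string
  | End_tag of string

(* Pull events from [next] one at a time. Only the text of the record that
   is currently being read is kept in memory; [handle] is called once for
   each completed <record> element (the element name is an assumption).
   Returns the number of records handled. *)
let process_events ~(next : unit -> event option) ~(handle : string -> unit) =
  let buf = Buffer.create 256 in
  let rec loop in_record count =
    match next () with
    | None -> count
    | Some (Start_tag "record") ->
        Buffer.clear buf;
        loop true count
    | Some (End_tag "record") ->
        handle (Buffer.contents buf);
        loop false (count + 1)
    | Some (Text s) ->
        if in_record then Buffer.add_string buf s;
        loop in_record count
    | Some (Start_tag _ | End_tag _) -> loop in_record count
  in
  loop false 0

(* Toy event source backed by a list, standing in for a real pull parser. *)
let source_of_list events =
  let rest = ref events in
  fun () ->
    match !rest with
    | [] -> None
    | e :: tl -> rest := tl; Some e
```

In the application described above, handle would append the record to the
batch that is later inserted into the database, so memory use is bounded by
the batch size rather than by the size of the XML file.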
Gerd
--
----------------------------------------------------------------------------
Gerd Stolpmann Telefon: +49 6151 997705 (privat)
Viktoriastr. 45
64293 Darmstadt EMail: gerd@gerd-stolpmann.de
Germany
----------------------------------------------------------------------------