From: Markus Mottl <markus.mottl@gmail.com>
To: Brighten Godfrey <pbg@cs.berkeley.edu>
Cc: Alain Frisch <alain@frisch.fr>, OCaml List <caml-list@yquem.inria.fr>
Subject: Re: [Caml-list] Strange performance bug
Date: Wed, 29 Apr 2009 09:58:33 -0400
Message-ID: <f8560b80904290658p6f5cacb9vb6a2cec1c77359a4@mail.gmail.com>
In-Reply-To: <6D9C5A68-1874-4BBC-AE3D-9CCC3614AF7C@cs.berkeley.edu>
On Wed, Apr 29, 2009 at 04:29, Brighten Godfrey <pbg@cs.berkeley.edu> wrote:
> I know nothing about the internals of these libraries. But, the program is
> continuously reading lines from the file. Thus, isn't there about the same
> amount of memory on the heap just before the problem starts and just after
> the problem starts? I guess it is plausible that somehow, closing the file
> and re-opening it triggers a bad interaction with the GC...
>
> But in comparison, using Str in the same way (i.e., compiling the regexp
> every time it is used) works fine.
Note that not precompiling the regular expressions costs you more than
the compilation overhead itself: it also creates vastly greater GC
pressure.
The current GC settings in Pcre trigger a full GC cycle for every 500
regular expressions allocated, i.e., a full major collection every 500
lines in your case. This setting works fine for just about any
application I've seen, because virtually nobody has to create patterns
dynamically at rates so high that this matters.
Thus, try hoisting out the compilation of the regexp first...
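A minimal sketch of what hoisting could look like, assuming the
pcre-ocaml functions Pcre.regexp and Pcre.pmatch. The pattern and the
function names are hypothetical, chosen only for illustration, and are
not taken from your program:

```ocaml
(* Needs the pcre-ocaml library, e.g.
   ocamlfind ocamlopt -package pcre -linkpkg example.ml *)

(* Hypothetical pattern, for illustration only. *)
let pattern = "[0-9]+"

(* Not hoisted: passing ~pat makes Pcre compile a fresh regexp on
   every call, i.e. one regexp allocation per line. *)
let count_matches_naive lines =
  List.fold_left
    (fun n line -> if Pcre.pmatch ~pat:pattern line then n + 1 else n)
    0 lines

(* Hoisted: compile once, outside the per-line loop, and reuse the
   compiled regexp via ~rex. *)
let rex = Pcre.regexp pattern

let count_matches_hoisted lines =
  List.fold_left
    (fun n line -> if Pcre.pmatch ~rex line then n + 1 else n)
    0 lines
```

Both functions should return the same count; the difference is only in
how many regexps get allocated while the file is being read.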
Markus
--
Markus Mottl http://www.ocaml.info markus.mottl@gmail.com