* Memory usage/ garbage collection question
@ 2005-10-14 9:49 Richard Jones
2005-10-14 10:02 ` [Caml-list] " skaller
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Richard Jones @ 2005-10-14 9:49 UTC (permalink / raw)
To: caml-list
I'm trying to optimise a program which is using a large amount of
memory and consequently thrashing.
The core of the program is an iteration over a list of something like
a million elements which consumes about 1/2 gig of RAM. The iteration
is:
List.iter (
fun row ->
(* put row into database and forget about it *)
) rows;
(* no further references to rows after this *)
This is the stdlib implementation of List.iter. Should the garbage
collector be able to collect the part of the list which has been
iterated over, during the iteration? At the moment it doesn't look
like it's doing so.
Rich.
--
Richard Jones, CTO Merjis Ltd.
Merjis - web marketing and technology - http://merjis.com
Team Notepad - intranets and extranets for business - http://team-notepad.com
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Caml-list] Memory usage/ garbage collection question
2005-10-14 9:49 Memory usage/ garbage collection question Richard Jones
@ 2005-10-14 10:02 ` skaller
2005-10-14 10:08 ` Olivier Andrieu
[not found] ` <c7ee61120510140258q5b7f393l8e3c2c3d45f49008@mail.gmail.com>
2 siblings, 0 replies; 5+ messages in thread
From: skaller @ 2005-10-14 10:02 UTC (permalink / raw)
To: Richard Jones; +Cc: caml-list
On Fri, 2005-10-14 at 10:49 +0100, Richard Jones wrote:
> I'm trying to optimise a program which is using a large amount of
> memory and consequently thrashing.
>
> The core of the program is an iteration over a list of something like
> a million elements which consumes about 1/2 gig of RAM. The iteration
> is:
>
> List.iter (
> fun row ->
> (* put row into database and forget about it *)
> ) rows;
> (* no further references to rows after this *)
>
> This is the stdlib implementation of List.iter. Should the garbage
> collector be able to collect the part of the list which has been
> iterated over, during the iteration? At the moment it doesn't look
> like it's doing so.
let rows = ref [] in
(* build up the list of rows here *)
let rec doit rows = match rows with
| [] -> ()
| h :: t -> install_in_db h; doit t (* tail call *)
in
doit (let r = !rows in rows := []; r)
--
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Caml-list] Memory usage/ garbage collection question
2005-10-14 9:49 Memory usage/ garbage collection question Richard Jones
2005-10-14 10:02 ` [Caml-list] " skaller
@ 2005-10-14 10:08 ` Olivier Andrieu
[not found] ` <c7ee61120510140258q5b7f393l8e3c2c3d45f49008@mail.gmail.com>
2 siblings, 0 replies; 5+ messages in thread
From: Olivier Andrieu @ 2005-10-14 10:08 UTC (permalink / raw)
To: Richard Jones; +Cc: caml-list
Hi,
Richard Jones [Friday 14 October 2005] :
> I'm trying to optimise a program which is using a large amount of
> memory and consequently thrashing.
>
> The core of the program is an iteration over a list of something like
> a million elements which consumes about 1/2 gig of RAM. The iteration
> is:
>
> List.iter (
> fun row ->
> (* put row into database and forget about it *)
> ) rows;
> (* no further references to rows after this *)
>
> This is the stdlib implementation of List.iter. Should the garbage
> collector be able to collect the part of the list which has been
> iterated over, during the iteration? At the moment it doesn't look
> like it's doing so.
« Short answer: with ocamlc, no. With ocamlopt, yes. »
cf. these informatives messages by Xavier:
http://groups.google.com/group/fa.caml/msg/f194a3240d240e71
http://groups.google.com/group/fa.caml/msg/d36cb040d0d87ca6
--
Olivier
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Caml-list] Memory usage/ garbage collection question
[not found] ` <c7ee61120510140258q5b7f393l8e3c2c3d45f49008@mail.gmail.com>
@ 2005-10-14 10:27 ` Richard Jones
2005-10-14 10:51 ` Frederic van der Plancke
0 siblings, 1 reply; 5+ messages in thread
From: Richard Jones @ 2005-10-14 10:27 UTC (permalink / raw)
To: Seth J. Fogarty; +Cc: caml-list
On Fri, Oct 14, 2005 at 04:58:59AM -0500, Seth J. Fogarty wrote:
> I do not see why iterating through a list that consumed a lot of
> memory should (innately) cause you to thrash. What is thrashing?
> access to the disk? Garbage collection?
That's where I'm not really sure, except that it is observably
thrashing. Since it's a simple iteration, I guess that would
implicate the GC?
> No, because you have bound rows to a name. Now, I believe if rows is
> returned by a function, and is NOT bound by name in that function, it
> can be garbage collected. I.E.
Ah OK ... I'm interested though: why does binding a value to a name
cause problems? Surely at this level (ocamlopt generated code) there
ought to be no difference between a named value and an unnamed one?
Rich.
--
Richard Jones, CTO Merjis Ltd.
Merjis - web marketing and technology - http://merjis.com
Team Notepad - intranets and extranets for business - http://team-notepad.com
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Caml-list] Memory usage/ garbage collection question
2005-10-14 10:27 ` Richard Jones
@ 2005-10-14 10:51 ` Frederic van der Plancke
0 siblings, 0 replies; 5+ messages in thread
From: Frederic van der Plancke @ 2005-10-14 10:51 UTC (permalink / raw)
To: caml-list
Richard Jones wrote:
>
> On Fri, Oct 14, 2005 at 04:58:59AM -0500, Seth J. Fogarty wrote:
> > I do not see why iterating through a list that consumed a lot of
> > memory should (innately) cause you to thrash. What is thrashing?
> > access to the disk? Garbage collection?
>
> That's where I'm not really sure, except that it is observably
> thrashing. Since it's a simple iteration, I guess that would
> implicate the GC?
>
> > No, because you have bound rows to a name. Now, I believe if rows is
> > returned by a function, and is NOT bound by name in that function, it
> > can be garbage collected. I.E.
>
> Ah OK ... I'm interested though: why does binding a value to a name
> cause problems? Surely at this level (ocamlopt generated code) there
> ought to be no difference between a named value and an unnamed one?
>
> Rich.
Perhaps the problem is not the name, but the fact that when you write
List.iter f rows
the compiler isn't going to go great lengths in optimizing out the reference to rows before the call to List.iter, since the optimisation probably isn't free and the compiler (and compiler writers) don't know in advance how worthwhile it would be.
If the call to List.iter was a tail call the compiler would probably be forced to optimise rows out.
Hence my 0.02 cents worth untested idea: you might try and use tail call optimisation to get rid of that reference.
say
let do_work () =
let rows = ... in
List.iter (...) rows
Frédéric
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2005-10-14 10:49 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-10-14 9:49 Memory usage/ garbage collection question Richard Jones
2005-10-14 10:02 ` [Caml-list] " skaller
2005-10-14 10:08 ` Olivier Andrieu
[not found] ` <c7ee61120510140258q5b7f393l8e3c2c3d45f49008@mail.gmail.com>
2005-10-14 10:27 ` Richard Jones
2005-10-14 10:51 ` Frederic van der Plancke
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox