* Weak hashtables & aggressive caching
@ 2006-08-14 14:58 Matt Gushee
2006-08-14 15:47 ` [Caml-list] " Richard Jones
2006-08-14 21:23 ` Jacques Garrigue
0 siblings, 2 replies; 12+ messages in thread
From: Matt Gushee @ 2006-08-14 14:58 UTC
To: caml-list
Hello, all--
I wrote a LablGTK-based image viewer this past weekend; one of its
features is an image cache--specifically, a weak hashtable that contains
values of type string * GdkPixbuf.pixbuf (the string being the file
name). When a particular image file is requested, it is retrieved from
the cache if it exists there; otherwise it is loaded from disk (and
placed in the cache at the same time). This is useful if the user wants
to quickly look back through a series of images that have already been
loaded, but it doesn't help with loading images for the first time.
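A minimal sketch of the kind of cache described above, written as an editorial illustration rather than as the viewer's actual code. To stay self-contained it uses Bytes.t as a stand-in for GdkPixbuf.pixbuf, and load_from_disk, cache, loads and get_image are hypothetical names. Each file name maps to a one-slot weak array, so the table itself never keeps an image alive.

```ocaml
(* Stand-in for GdkPixbuf.pixbuf so the sketch has no LablGTK dependency. *)
type image = Bytes.t

(* Hypothetical loader: a real viewer would read and decode the file here. *)
let load_from_disk (name : string) : image =
  Bytes.of_string ("pixels of " ^ name)

(* File name -> one-slot weak array holding the image. *)
let cache : (string, image Weak.t) Hashtbl.t = Hashtbl.create 16

(* Counts real loads, only to make the caching effect observable. *)
let loads = ref 0

let get_image (name : string) : image =
  let cached =
    match Hashtbl.find_opt cache name with
    | Some slot -> Weak.get slot 0   (* None if the GC has cleared the slot *)
    | None -> None
  in
  match cached with
  | Some img -> img
  | None ->
    incr loads;
    let img = load_from_disk name in
    let slot = Weak.create 1 in
    Weak.set slot 0 (Some img);
    Hashtbl.replace cache name slot;
    img
```

As long as some other part of the program still holds the image, a second get_image call with the same name returns the same value without reloading; once nothing else refers to it, the collector is free to empty the slot.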
It seems to me it might be useful to implement an aggressive caching
strategy--i.e., since the files to be loaded are known in advance (from
the command line), there could be a low-priority thread that would look
ahead and load images before the user requests them. Of course, if too
many images are loaded it might trigger the garbage collector, which
would defeat the whole purpose. Ideally, preloading should stop somewhat
before garbage collection starts.
From the documentation, it appears that Gc.stat and Gc.set (with the
Gc.control record) could be used to regulate the caching behavior, but I have not
worked with the Gc module before. Has anyone done something like this?
Is it worth the effort? Any non-obvious pitfalls I should be aware of?
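As a rough illustration of what regulating preloading through the Gc module could look like, here is a hedged sketch. The function names and the idea of a fixed word budget are assumptions made for the example. Note also that Gc statistics only describe the OCaml heap, while pixbuf pixel data is allocated outside it, so a check like this would at best be a partial signal.

```ocaml
(* True while the OCaml major heap is smaller than [budget_words].
   Gc.quick_stat is cheaper than Gc.stat because it does not walk the heap. *)
let under_budget ~(budget_words : int) : bool =
  let s = Gc.quick_stat () in
  s.Gc.heap_words < budget_words

(* Print a couple of counters a preloading thread might want to watch. *)
let report () =
  let s = Gc.quick_stat () in
  Printf.printf "heap: %d words, major collections so far: %d\n"
    s.Gc.heap_words s.Gc.major_collections

(* Gc.set takes a full Gc.control record; the usual idiom is to copy the
   current settings and override one field. The value 200 is arbitrary. *)
let relax_gc () =
  Gc.set { (Gc.get ()) with Gc.space_overhead = 200 }
```

A preloader could call under_budget before each speculative load and simply stop when it returns false.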
--
Matt Gushee
: Bantam - lightweight file manager : matt.gushee.net/software/bantam/ :
: RASCL's A Simple Configuration Language : matt.gushee.net/rascl/ :
* Re: [Caml-list] Weak hashtables & aggressive caching
From: Richard Jones @ 2006-08-14 15:47 UTC
To: Matt Gushee; +Cc: caml-list
On Mon, Aug 14, 2006 at 08:58:29AM -0600, Matt Gushee wrote:
> It seems to me it might be useful to implement an aggressive caching
> strategy--i.e., since the files to be loaded are known in advance (from
> the command line),[...]
Please no! When running X remotely this will cause images to be
transferred (uncompressed) over the network and stored inside the X
server when they may not even be viewed. This sort of thing is
already a serious problem with programs like 'eog', making them
virtually unusable remotely.
Rich.
--
Richard Jones, CTO Merjis Ltd.
Merjis - web marketing and technology - http://merjis.com
Team Notepad - intranets and extranets for business - http://team-notepad.com
* Re: [Caml-list] Weak hashtables & aggressive caching
From: Matt Gushee @ 2006-08-14 16:28 UTC
To: caml-list
Richard Jones wrote:
> On Mon, Aug 14, 2006 at 08:58:29AM -0600, Matt Gushee wrote:
>> It seems to me it might be useful to implement an aggressive caching
>> strategy--i.e., since the files to be loaded are known in advance (from
>> the command line),[...]
>
> Please no! When running X remotely this will cause images to be
> transferred (uncompressed) over the network and stored inside the X
> server when they may not even be viewed. This sort of thing is
> already a serious problem with programs like 'eog', making them
> virtually unusable remotely.
Hmm ... well, I happen to have the heretical view that in an age of
cheap, powerful PCs and inexpensive software, running X remotely is just
plain absurd in most situations. Okay, yeah, there are thin clients, but
who actually uses them--other than a few large corporations, for whom I
have no sympathy?
However, I also know that my philosophy is on the fringe, and from a
practical standpoint people actually do some of these absurd things, so
... thanks for the heads-up.
Wait a minute, though. According to the Gdk reference manual,
<http://developer.gnome.org/doc/API/2.0/gdk/gdk-Pixbufs.html#id2861842>
Pixbufs are client-side images.
If that's true, I don't understand how loading pixbufs from files would
affect the X server.
--
Matt Gushee
: Bantam - lightweight file manager : matt.gushee.net/software/bantam/ :
: RASCL's A Simple Configuration Language : matt.gushee.net/rascl/ :
* Re: [Caml-list] Weak hashtables & aggressive caching
From: Matt Gushee @ 2006-08-14 17:35 UTC
To: caml-list
I will try to make this my last off-topic message on this subject.
Brian Hurt wrote:
> I'm running X remotely to access remote machines (note the plural). One
> of the advantages of X is that I can run GUI apps on machines that I'm
> not sitting in front of.
And what percentage of the computer-using population do you suppose has
*ever* done that?
> I'm also using RealVNC to log into other
> Windows machines. Please don't assume *your* situation is *everyone's*
> situation, as this makes your software significantly less useful.
No. It limits the population of users for whom the software is useful,
which is a very different matter. Don't make assumptions about what I
assume. I know very well there are different kinds of users; where my
thinking differs from the mainstream is that I believe it is
impossible--or at least very difficult--to create software that delivers
a good user experience for all types of users.
To take one example, what tool would you use to develop a Web site? Some
people find Cold Fusion highly productive. That's fine. I find Vim to be
far more productive than any other tool I've tried, at least for the
kinds of Web sites I develop (mostly my own). I'd bet a large sum of
money that either one is far better for its target users than some
hypothetical app that tried to address both groups.
BTW, some of the leading thinkers on human-computer interaction (e.g.
Jef Raskin and Alan Cooper) have argued--based on extensive
research--that offering many different ways to accomplish a task is
usually bad for usability. They're talking about user interfaces, but
their thinking is at least consistent with my broader claim that no
single app is suitable for all circumstances.
Anyway, if I release an app to the public, I try to be very clear--as
clear as you can be in words and screenshots--about what it does and
doesn't do, and what kinds of users and usage situations it is suitable
for. If people don't want to use my software, that's fine. If I can't
develop something that will bring in significant income--and I long ago
gave up hope of doing that--I'll bloody well develop something I like.
As long as I'm clear about what I like, and don't expect the whole world
to agree with me, I don't see why that's a problem.
--
Matt Gushee
: Bantam - lightweight file manager : matt.gushee.net/software/bantam/ :
: RASCL's A Simple Configuration Language : matt.gushee.net/rascl/ :
* Re: [Caml-list] Weak hashtables & aggressive caching
From: Richard Jones @ 2006-08-14 18:18 UTC
To: Matt Gushee; +Cc: caml-list
On Mon, Aug 14, 2006 at 10:28:39AM -0600, Matt Gushee wrote:
> Wait a minute, though. According to the Gdk reference manual,
> <http://developer.gnome.org/doc/API/2.0/gdk/gdk-Pixbufs.html#id2861842>
>
> Pixbufs are client-side images.
Ah right, pixbufs, pixmaps ... In that case why bother preloading
them at all? eog is flagrant with regard to pixmaps because the
developers believe it allows them to display images quickly (the
images are already on the X server, converted from JPEGs into raw
pixels). In this age of fast CPUs and slow RAM this is unlikely to be
the case.
Rich.
--
Richard Jones, CTO Merjis Ltd.
Merjis - web marketing and technology - http://merjis.com
Team Notepad - intranets and extranets for business - http://team-notepad.com
* Re: [Caml-list] Weak hashtables & aggressive caching
From: Jacques Garrigue @ 2006-08-14 21:23 UTC
To: matt; +Cc: caml-list
From: Matt Gushee <matt@gushee.net>
> I wrote a LablGTK-based image viewer this past weekend; one of its
> features is an image cache--specifically, a weak hashtable that contains
> values of type string * GdkPixbuf.pixbuf (the string being the file
> name). When a particular image file is requested, it is retrieved from
> the cache if it exists there; otherwise it is loaded from disk (and
> placed in the cache at the same time). This is useful if the user wants
> to quickly look back through a series of images that have already been
> loaded, but it doesn't help with loading images for the first time.
I wonder how you trigger the GC, to both keep the cache long enough,
and to avoid filling the memory too much, and resulting in lots of
swapping.
With OCaml data structures, the GC does a good job, as it is
triggered every time the already allocated memory fills up. Hopefully this
means that the memory set should not increase. But with external data
structures like pixbufs, the GC is called in a pre-programmed way,
currently at least after every 10 pixbuf allocations. This is probably
too often for your scheme (you won't get more than 9 images in memory),
but calling it less often might not be enough (big images would fill the
memory before the GC is called).
Considering the difficulty of avoiding memory overflow, the only workable
approach still seems to be an over-eager GC that runs much more
often than necessary. But as a result the caching effect is very
limited. Otherwise you need to change all the parameters in lablgtk.
Jacques Garrigue
* Re: [Caml-list] Weak hashtables & aggressive caching
From: Matt Gushee @ 2006-08-14 23:25 UTC
To: caml-list
Richard Jones wrote:
> On Mon, Aug 14, 2006 at 10:28:39AM -0600, Matt Gushee wrote:
>> Wait a minute, though. According to the Gdk reference manual,
>> <http://developer.gnome.org/doc/API/2.0/gdk/gdk-Pixbufs.html#id2861842>
>>
>> Pixbufs are client-side images.
>
> Ah right, pixbufs, pixmaps ... In that case why bother preloading
> them at all?
Well, maybe I shouldn't. That's why I asked if it was worth the effort.
> eog is flagrant with regard to pixmaps because the
> developers believe it allows them to display images quickly (the
> images are already on the X server, converted from JPEGs into raw
> pixels). In this age of fast CPUs and slow RAM this is unlikely to be
> the case.
Thanks for your insights.
--
Matt Gushee
: Bantam - lightweight file manager : matt.gushee.net/software/bantam/ :
: RASCL's A Simple Configuration Language : matt.gushee.net/rascl/ :
* Re: [Caml-list] Weak hashtables & aggressive caching
From: Matt Gushee @ 2006-08-14 23:30 UTC
To: caml-list
Jacques Garrigue wrote:
> I wonder how you trigger the GC, to both keep the cache long enough,
> and to avoid filling the memory too much, and resulting in lots of
> swapping.
I wasn't planning to trigger the GC explicitly. My thought was simply to
stop preloading before GC begins (or at least *when* GC begins).
> means that the memory set should not increase. But with external data
> structures like pixbufs, the GC is called in a pre-programmed way,
> currently at least after every 10 pixbuf allocations.
You mean that LablGTK directly invokes the garbage collector after 10
images. That's not much (unless, of course, they are big images). Sounds
like it's a lot of trouble for a small benefit.
--
Matt Gushee
: Bantam - lightweight file manager : matt.gushee.net/software/bantam/ :
: RASCL's A Simple Configuration Language : matt.gushee.net/rascl/ :
* Re: [Caml-list] Weak hashtables & aggressive caching
From: skaller @ 2006-08-15 4:55 UTC
To: Jacques Garrigue; +Cc: matt, caml-list
On Tue, 2006-08-15 at 06:23 +0900, Jacques Garrigue wrote:
> From: Matt Gushee <matt@gushee.net>
>
> > I wrote a LablGTK-based image viewer this past weekend; one of its
> > features is an image cache--specifically, a weak hashtable that contains
> > values of type string * GdkPixbuf.pixbuf (the string being the file
> > name). When a particular image file is requested, it is retrieved from
> > the cache if it exists there; otherwise it is loaded from disk (and
> > placed in the cache at the same time). This is useful if the user wants
> > to quickly look back through a series of images that have already been
> > loaded, but it doesn't help with loading images for the first time.
>
> I wonder how you trigger the GC, to both keep the cache long enough,
> and to avoid filling the memory too much, and resulting in lots of
> swapping.
I'm confused. First, a pixmap doesn't have any pointers in it,
so it doesn't need to be scanned by the GC.
Second, you'd need a LOT of images to come even close
to running out of address space (on a 64 bit machine anyhow :)
And third, there would be no swapping, unless you were
flicking between the images .. in which case there'd
be swapping no matter what.
> Considering the difficulty of avoiding memory overflow,
I have thousands of images and I can scan them at full size
very fast with GQView .. I can only barely see the drawing
happen .. it almost keeps up with the keyboard repeat rate
at full screen size .. and that includes *scaling* the images.
Mind you .. GQView is extremely quick and it knows when to move on
(interrupts rendering when you tell it to view a new image).
(this is with a low end nVidia card on an amd64 3200 single core/1GRam)
Let's get real here: the difficulties arise editing video,
not still pictures.
--
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net
* Re: [Caml-list] Weak hashtables & aggressive caching
From: Matt Gushee @ 2006-08-15 16:17 UTC
To: caml-list
skaller wrote:
>> I wonder how you trigger the GC, to both keep the cache long enough,
>> and to avoid filling the memory too much, and resulting in lots of
>> swapping.
>
> I'm confused. First, a pixmap doesn't have any pointers in it,
> so it doesn't need to be scanned by the GC.
Does that statement apply to a GdkPixbuf.pixbuf? That is the type I am
using.
I took Jacques' statement to mean that LablGTK was explicitly invoking
the GC--though of course I'd like to hear his answer on that point.
> Second, you'd need a LOT of images to come even close
> to running out of address space (on a 64 bit machine anyhow :)
:) Of course, many people are still using those antiquated 32-bit
processors. I know that real software developers use overpowered
machines to help insulate them from the constraints that face ordinary
users. Me, I can't afford a powerful computer, so I guess I'm not a real
developer.
> I have thousands of images and I can scan them at full size
> very fast with GQView .. I can only barely see the drawing
> happen .. it almost keeps up with the keyboard repeat rate
> at full screen size .. and that includes *scaling* the images.
> Mind you .. GQView is extremely quick
Interesting. For me it's neither fast nor slow.
> and it knows when to move on
> (interrupts rendering when you tell it to view a new image).
That's good. I would like to know (or figure out) how to do that with
LablGTK.
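One possible shape for interruptible rendering, offered as an editorial sketch and not as GQView's or LablGTK's mechanism, is a request counter: each new request takes a fresh ticket, and work in progress checks between small steps whether its ticket is still current. All names below are hypothetical, and between_rows stands in for whatever point the real program uses to return to the GUI main loop.

```ocaml
(* Ticket of the most recent request; older tickets are stale. *)
let current_request = ref 0

(* Called when the user asks for another image; invalidates older work. *)
let new_request () : int =
  incr current_request;
  !current_request

(* Render [rows] rows one at a time, giving up as soon as a newer request
   exists. Returns `Done, or `Cancelled n where n is the first row that
   was not rendered. *)
let render ~rows ~between_rows ticket =
  let rec loop row =
    if ticket <> !current_request then `Cancelled row
    else if row = rows then `Done
    else begin
      (* a real viewer would draw row [row] here *)
      between_rows row;
      loop (row + 1)
    end
  in
  loop 0
```

The design choice is that cancellation is cooperative: nothing is killed, the stale job just notices at its next check and stops, so the granularity of the steps decides how quickly the viewer moves on.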
> Let's get real here: the difficulties arise editing video,
> not still pictures.
Except for those of us with really old hardware. I imagine there are a
lot of such folks in Africa; and seeing as America is rapidly becoming a
Third World country, maybe more than you'd expect here.
--
Matt Gushee
: Bantam - lightweight file manager : matt.gushee.net/software/bantam/ :
: RASCL's A Simple Configuration Language : matt.gushee.net/rascl/ :
* Re: [Caml-list] Weak hashtables & aggressive caching
From: Jacques Garrigue @ 2006-08-16 0:54 UTC
To: matt; +Cc: caml-list
From: Matt Gushee <matt@gushee.net>
> > I wonder how you trigger the GC, to both keep the cache long enough,
> > and to avoid filling the memory too much, and resulting in lots of
> > swapping.
>
> I wasn't planning to trigger the GC explicitly. My thought was simply to
> stop preloading before GC begins (or at least *when* GC begins).
But, if you wait for the GC to begin this is too late: all your weak
references will be collected as garbage, so that your cache will be
emptied as soon as you fill it.
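The point about weak references can be checked with a small experiment, sketched here with Bytes.t standing in for a pixbuf and with names chosen for the example. One value is reachable only through a weak slot, the other is also held by an ordinary global.

```ocaml
(* Reachable through a global, so the GC must keep it. *)
let kept = Bytes.create 4096

let weak_only : Bytes.t Weak.t = Weak.create 1
let also_strong : Bytes.t Weak.t = Weak.create 1

(* Filled in a separate function so that no local variable keeps the
   first value alive after the call returns. *)
let fill () =
  Weak.set weak_only 0 (Some (Bytes.create 4096));
  Weak.set also_strong 0 (Some kept)

let () =
  fill ();
  Gc.full_major ();
  Printf.printf "weak only, still cached: %b\n" (Weak.check weak_only 0);
  Printf.printf "strongly held, still cached: %b\n" (Weak.check also_strong 0)
```

After a full major collection the first slot should normally be reported as empty and the second as still full, which is the behaviour described above: a cache made only of weak references survives exactly until the collector runs.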
> > means that the memory set should not increase. But with external data
> > structures like pixbufs, the GC is called in a pre-programmed way,
> > currently at least after every 10 pixbuf allocations.
>
> You mean that LablGTK directly invokes the garbage collector after 10
> images. That's not much (unless, of course, they are big images). Sounds
> like it's a lot of trouble for a small benefit.
Again, the trouble is that there is only one allocation function for
pixbufs, and it doesn't look at their size. And it isn't aware of how
much memory is available either. So the choice was to be extremely
conservative. This is maybe a bad idea, but the intent is to avoid
keeping big garbage around, as I have seen really bad situations in
the past (programs growing to more than 100MB pretty fast.) Since weak
references are counted as garbage, there is clearly a contradiction.
I suppose more GC tuning in lablgtk would be a good thing. But I
really don't see how to do it easily with the ocaml allocation API.
The only way to interface external allocation with the GC is an
increment N you pass when calling alloc_custom. It tells ocaml to
shorten the time to next GC by N % (actually this is a ratio, so you
can provide smaller increments.) The trouble is that the GC is
triggered by the sum of all increments for all allocations. So if you
want to slow it, you need to reduce all increments everywhere...
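To make the arithmetic concrete, here is a toy model of the accumulation described above. It is an illustration under simplifying assumptions, not the runtime's code: increments are given as whole percentages of "one GC", a GC is counted when the running sum reaches 100, and the sum then restarts from zero. The function name and the sample figures are made up for the example.

```ocaml
(* Count how many GCs a sequence of external allocations would trigger
   in the toy model. Each element is that allocation's increment, in
   percent of the distance to the next GC. *)
let count_gcs (increments : int list) : int =
  let step (acc, gcs) inc =
    let acc = acc + inc in
    if acc >= 100 then (0, gcs + 1) else (acc, gcs)
  in
  snd (List.fold_left step (0, 0) increments)

(* Ten pixbuf-like allocations at 10% each reach the threshold once. *)
let ten_pixbufs = List.init 10 (fun _ -> 10)

(* Five such allocations plus fifty small ones at 1% each reach it too:
   the sum over all allocations is what counts, not any single kind. *)
let mixed = List.init 5 (fun _ -> 10) @ List.init 50 (fun _ -> 1)
```

In this model count_gcs ten_pixbufs and count_gcs mixed both give 1, which illustrates why lowering the increment for pixbufs alone would not be enough if other allocations keep contributing to the same sum.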
Jacques Garrigue
* Re: [Caml-list] Weak hashtables & aggressive caching
From: Matt Gushee @ 2006-08-16 4:33 UTC
To: caml-list
Jacques Garrigue wrote:
>>> means that the memory set should not increase. But with external data
>>> structures like pixbufs, the GC is called in a pre-programmed way,
>>> currently at least after every 10 pixbuf allocations.
>> You mean that LablGTK directly invokes the garbage collector after
>> 10 images. That's not much (unless, of course, they are big images).
>> Sounds like it's a lot of trouble for a small benefit.
>
> Again, the trouble is that there is only one allocation function for
> pixbufs, and it doesn't look at their size. And it isn't aware of how
> much memory is available either. So the choice was to be extremely
> conservative.
I'm sorry. I meant that my notion of preloading images would be a lot of
trouble for a small benefit. I don't have sufficient expertise to judge
your garbage collection strategy.
Anyway, thanks for the explanation.
--
Matt Gushee
: Bantam - lightweight file manager : matt.gushee.net/software/bantam/ :
: RASCL's A Simple Configuration Language : matt.gushee.net/rascl/ :