From: Brian Hurt <bhurt@janestcapital.com>
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] OCaml runtime using too much memory in 64-bit Linux
Date: Wed, 14 Nov 2007 08:45:31 -0500 [thread overview]
Message-ID: <473AFBFB.1090802@janestcapital.com> (raw)
In-Reply-To: <473AF04C.7030107@inria.fr>
Xavier Leroy wrote:
>
>If the problem is confirmed, there are several ways to go about it.
>One is to implement the page table with a sparse data structure,
>e.g. a hash table. However, the major GC and some primitives like
>polymorphic equality perform *lots* of page table lookups, so a
>performance hit is to be expected.
>
I've been contemplating doing this on my own (not at work, I comment),
just to see how much of a hit it is. If no one else steps up to the
plate, I will.
One important comment I will make, in moving to a hash table the size of
pages has to increase. The current implementation uses 1 byte per (4K)
page for the map- a hash table would use, I project, about 4 words (16
or 32 bytes) per page. To keep the memory utilization equal, we'd need
to go for a larger page size- 64K for 32-bit systems. I'd be inclined
to go larger than that- the 4K page size was standardized back when the
average system (using memory protection) has 1MB, and 16MB was a huge
amount of memory. Memory sizes have increased 1024-fold in the same
time, meaning I could make an argument for 4M pages. I'm not sure I
would (I think you'd run into a fragmentation problem on 32-bit), but it
makes the page sizes I'd probably settle on (256K for 32-bit, 1M for
64-bit) a lot more palatable. :-)
An advantage large pages would have is that they'd make the table a lot
smaller, and thus a lot more cache friendly. I mean, think about it-
1GB of 4K pages at 1 byte per page is 256K, while 1GB of 256K pages at
16 bytes per page is 64K, 1/4 the size. 1GB of 1M pages at 32 bytes per
page is 32K, smaller yet. The smaller table is more likely to fit into
cache, and cheaper to load into cache. While I wouldn't want to
gaurentee anything, I could easily see the smaller table size that fits
into cache actually gives a performance boost. I've certainly seen
weirder things happen.
>The other is to revise OCaml's
>data representations so that the GC and polymorphic primitives no
>longer need to know which pointers fall in the major heap. This seems
>possible in principle, but will take quite a bit of work and break a
>lot of badly written C/OCaml interface code. You've been warned :-)
>
>
>
Of the two, I think I like the hashtable idea better.
If it isn't clear, this whole email is just speaking for myself. I'm
not allowed to speak for Jane Street (heck, if I'm luck, I'm allowed to
speak *to* Jane Street :-).
Brian
next prev parent reply other threads:[~2007-11-14 13:45 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-11-07 17:28 Adam Chlipala
2007-11-07 18:20 ` [Caml-list] " Gerd Stolpmann
2007-11-07 19:12 ` Adam Chlipala
2007-11-08 12:56 ` Samuel Mimram
2007-11-14 4:20 ` Romain Beauxis
2007-11-14 12:03 ` Vladimir Shabanov
2007-11-14 12:55 ` Xavier Leroy
2007-11-14 13:45 ` Brian Hurt [this message]
2007-11-14 14:16 ` Romain Beauxis
2007-11-14 15:56 ` Markus Mottl
2007-11-14 16:22 ` Stefan Monnier
2007-11-14 16:36 ` [Caml-list] " Brian Hurt
2007-11-14 17:08 ` Lionel Elie Mamane
2007-11-14 17:26 ` Stefan Monnier
2007-11-14 16:45 ` Lionel Elie Mamane
2007-11-14 17:08 ` Lionel Elie Mamane
2007-11-08 20:51 ` [Caml-list] " Romain Beauxis
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=473AFBFB.1090802@janestcapital.com \
--to=bhurt@janestcapital.com \
--cc=caml-list@inria.fr \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox