Mailing list for all users of the OCaml language and system.
From: Yotam Barnoy <yotambarnoy@gmail.com>
To: Goswin von Brederlow <goswin-v-b@web.de>
Cc: Ocaml Mailing List <caml-list@inria.fr>
Subject: Re: [Caml-list] Proposal: re-design of ocaml headers
Date: Thu, 30 Jan 2014 15:53:51 -0500
Message-ID: <CAN6ygO=r40yPrwKMrWQcRoiEuGZk7DwpAGmAC2-LFmf2nrAxOA@mail.gmail.com>
In-Reply-To: <CAN6ygOk4OLJN+Cvsips6nic16d7ZTomwceWPXX91O-S0k5xzKA@mail.gmail.com>


I'm resurrecting this thread.

Given the recent discussion about optimization, I've done some more
thinking and simplification work on my proposal for a better ocaml header.
Much of this is just adopting Goswin's proposal, which made a lot of sense.
I tried to also reduce the variations while allowing for a lot of useful
unboxing of floats, int64s etc. I've also tried to tackle tuples in this
version -- basically, in order to be compatible with polymorphic functions,
you really want tuples and all other polymorphic types (as in 'a embedded
inside records) to be of uniform size. Expanding that size under some
circumstances to 64 bits on 32 bit platforms makes sense if, for example,
the tuple contains many floats. I don't think handling both what I call
narrow and wide polymorphic types (for lack of a better name) will make the
code much more complex -- I certainly don't want it to result in heavy
register spilling.

Also, the dynamic parts of the header now appear BEFORE the header itself.
Their highest bit is 0 to indicate that they're not the main header. This
will be seen by GC code, which can afford to do the few extra comparisons.
Mutator code will always have the main header available right before the
data.
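
To illustrate the high-bit convention above, here is a rough C sketch of how
GC code walking a block from its first word might locate the main header on a
32-bit platform. The function name and the bound of two extension words (the
size word and the ext word described below) are my assumptions for
illustration, not part of the proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 31 is 1 on the main header and 0 on any extension word. */
#define MAIN_HEADER_BIT UINT32_C(0x80000000)

/* Hypothetical helper: starting at the first word of a block, skip
   extension words until the main header is found. The proposal describes
   two kinds of extension word, so at most two are skipped (assumption).
   Returns NULL if no main header is found within that bound. */
static const uint32_t *find_main_header(const uint32_t *block_start)
{
    assert(block_start != NULL);
    for (size_t i = 0; i <= 2; i++) {
        if (block_start[i] & MAIN_HEADER_BIT)
            return &block_start[i];
    }
    return NULL;
}
```

Mutator code never needs this loop, since it always holds a pointer to the
data and finds the main header in the word right before it.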


So here it is:


+ For 32-bit architectures:
---------------------------

Wide Types
----------
- For polymorphic types (containing 'a), member sizes need to be uniform
for speed. Tuples are fully polymorphic. These polymorphic members can
either be narrow (32 bit) as they are now or wide (64 bit) to accommodate
float/int64 unboxing.

     +-----------+--------------+------+-----+-------+------+
     | 1 | wosize|    fbits     |ebits |noptr| color | tag  |
     +-----------+--------------+------+-----+-------+------+
bits  31  30   21 20          15   14     13  12   11 10   0

- noptr: no pointers present.
- fbits: 2 bits per member for the first 3 members; wide types cannot be
represented unless the ext word (extbits) is used
    - 00: tagged (int/pointer)
    - 01: int32/native int, C pointer
    - 10: float
    - 11: int64

- if wosize = 0, the size word is used for wosize
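
As a sanity check on the bit positions in the diagram, here is a small C
sketch that splits a 32-bit main header into its fields. The struct and
function names are made up for illustration; only the bit ranges come from
the layout above.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of the proposed 32-bit main header (names are illustrative). */
typedef struct {
    unsigned wosize; /* bits 30..21 */
    unsigned fbits;  /* bits 20..15, 2 bits per member */
    unsigned ebits;  /* bit 14 */
    unsigned noptr;  /* bit 13 */
    unsigned color;  /* bits 12..11 */
    unsigned tag;    /* bits 10..0 */
} header32_fields;

static header32_fields decode_header32(uint32_t h)
{
    header32_fields f;
    assert(h >> 31);                 /* bit 31 must be 1 on a main header */
    f.wosize = (h >> 21) & 0x3FF;
    f.fbits  = (h >> 15) & 0x3F;
    f.ebits  = (h >> 14) & 1;
    f.noptr  = (h >> 13) & 1;
    f.color  = (h >> 11) & 3;
    f.tag    = h & 0x7FF;
    return f;
}
```

The field widths add up to 1 + 10 + 6 + 1 + 1 + 2 + 11 = 32 bits.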

size word
---------
- only present if wosize is 0 in the header. Precedes the main header
     +---------------------------------------------+
     | 0 |               wosize                    |
     +---------------------------------------------+
bits  31  30                                      0

- bit 31 is used to identify this as a header extension
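
The wosize rule can be sketched in C as follows. To avoid assuming where the
size word sits relative to a possible ext word, this hypothetical helper
takes the size word as an argument and only looks at it when the in-header
wosize field is 0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: block size in words for the 32-bit layout.
   size_word is consulted only when the header's wosize field is 0. */
static uint32_t effective_wosize32(uint32_t header, uint32_t size_word)
{
    uint32_t inline_wosize = (header >> 21) & 0x3FF;   /* bits 30..21 */
    if (inline_wosize != 0)
        return inline_wosize;
    assert((size_word >> 31) == 0);   /* extension words have bit 31 clear */
    return size_word & UINT32_C(0x7FFFFFFF);           /* bits 30..0 */
}
```

With this reading, any block larger than 1023 words pays for one extra
word, and smaller blocks keep a single-word header.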


ext word
--------
- Only present if ebits is 1. Describes the first 15 members of the object
(+ 3 members from fbits)
     +----------+----------------------------------+
     | 0 | wide |        extbits                   |
     +----------+----------------------------------+
bits   31  30     29                              0

- bit 31 is used to identify this as a header extension
- wide: determines if 'a types in the first 18 words are wide (64 bit) or
narrow (32 bit)
- extbits: same as fbits
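
Here is a C sketch of how the kind of member i could be looked up from fbits
and extbits together. The proposal does not say which member maps to which
bit pair, so the ordering here (member 0 in the two lowest fbits, members
3..17 in extbits from the low end up) is an assumption, as is treating
members past the described prefix as tagged. All names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* 2-bit member encodings from the 32-bit fbits table. */
typedef enum {
    KIND_TAGGED = 0,  /* 00: tagged int/pointer */
    KIND_INT32  = 1,  /* 01: int32/native int, C pointer */
    KIND_FLOAT  = 2,  /* 10: float */
    KIND_INT64  = 3   /* 11: int64 */
} member_kind;

/* Assumed bit order: member 0 is the lowest bit pair of fbits,
   member 3 is the lowest bit pair of extbits. */
static member_kind member_kind32(uint32_t header, uint32_t ext_word, unsigned i)
{
    unsigned fbits   = (header >> 15) & 0x3F;
    unsigned has_ext = (header >> 14) & 1;      /* ebits */
    if (i < 3)
        return (member_kind)((fbits >> (2 * i)) & 3);
    if (has_ext && i < 18) {
        assert((ext_word >> 31) == 0);          /* extension marker */
        return (member_kind)((ext_word >> (2 * (i - 3))) & 3);
    }
    return KIND_TAGGED;   /* assumption: undescribed members are tagged */
}

/* The wide flag is bit 30 of the ext word. */
static int ext_word_is_wide(uint32_t ext_word)
{
    return (int)((ext_word >> 30) & 1);
}
```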


Tuples
------
- Tuples with any fbits on in the header word are automatically wide tuples
(64 bit) for the first 3 words
- Tuples with ebits are automatically wide tuples for the first 18 words.
The wide bit in the ext word is ignored.
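
The two tuple rules boil down to a small decision, sketched here in C. The
function name is hypothetical, and giving ebits priority when both ebits and
fbits are set is my reading of the rules rather than something stated above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: how many leading words of a tuple are wide (64 bit)
   given its 32-bit main header. Returns 0 for an all-narrow tuple. */
static unsigned tuple_wide_prefix32(uint32_t header)
{
    assert(header >> 31);                    /* must be a main header */
    unsigned fbits = (header >> 15) & 0x3F;  /* bits 20..15 */
    unsigned ebits = (header >> 14) & 1;     /* bit 14 */
    if (ebits)
        return 18;   /* 3 members from fbits + 15 from the ext word */
    if (fbits != 0)
        return 3;    /* only the members described by fbits */
    return 0;
}
```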

Strings
-------
- In strings, the last fbit, ebits and noptr function as the string size
modifier (currently present at the end of the string). This improves cache
locality on large strings. Wosize expands to include the fbits.
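
One possible reading of the string layout, as a C sketch: the last fbit,
ebits and noptr (bits 15..13) form a 3-bit size modifier, and wosize grows
to cover bits 30..16. How the modifier turns into a byte length is not
spelled out above; this sketch assumes it counts the unused bytes at the end
of the last word. Names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* String header fields under the assumed reading of the 32-bit layout. */
static uint32_t string_wosize32(uint32_t h)   { return (h >> 16) & 0x7FFF; }
static uint32_t string_modifier32(uint32_t h) { return (h >> 13) & 7; }

/* Assumption: modifier = number of unused trailing bytes in the block. */
static uint32_t string_length32(uint32_t h)
{
    uint32_t bytes = string_wosize32(h) * 4;   /* 4 bytes per 32-bit word */
    uint32_t pad   = string_modifier32(h);
    assert(pad <= bytes);
    return bytes - pad;
}
```

The point of the change is that the length can be computed from the header
alone, without touching the last word of a large string.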

Arrays
------
- Arrays of integers, int64, floats, C pointers etc can all be handled with
the regular array type.
- Only 2 lower fbits are needed for the type. The other fbits are joined
with wosize.
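
For arrays, a C sketch of the field split on 32-bit: the two lowest fbits
(bits 16..15) give the element type, and the remaining four fbits are read
together with wosize as one 14-bit field (bits 30..17). Treating the joined
bits as one contiguous field is an assumption; the names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Element type of an array: the two lowest fbits (bits 16..15). */
static unsigned array_elem_kind32(uint32_t h)
{
    assert(h >> 31);              /* must be a main header */
    return (h >> 15) & 3;
}

/* Array size in words: wosize joined with the upper four fbits,
   read as one contiguous field in bits 30..17 (assumption). */
static uint32_t array_wosize32(uint32_t h)
{
    assert(h >> 31);
    return (h >> 17) & 0x3FFF;
}
```

This raises the largest array that fits in a single-word header from 1023 to
16383 words.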


+ For 64-bit architectures:
--------------------------

     +-----------+------------------+------+-----+-------+------+
     | 1 | wosize|    fbits         |ebits |noptr| color | tag  |
     +-----------+------------------+------+-----+-------+------+
bits  63  62   43 42              15   14     13  12   11 10   0


- noptr: a structure with no pointers.
- fbits: 2 bits per object member
    - 00: tagged (int/pointer)
    - 01: int32
    - 10: float
    - 11: int64/native int/C pointer

- ebits: use ext word for signaling bits. fbits in header become bottom
bits of wosize.
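
A C sketch of the 64-bit header under these rules. When ebits is set, the 28
fbits are read as the low bits of wosize, giving a 48-bit size in bits
62..15; otherwise wosize is the 20-bit field in bits 62..43 and fbits
describes up to 14 members. The function names and the bit order of members
within fbits (member 0 lowest) are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Block size in words for the proposed 64-bit main header. */
static uint64_t wosize64(uint64_t h)
{
    assert(h >> 63);                           /* main header marker */
    if ((h >> 14) & 1)                         /* ebits set: fbits join wosize */
        return (h >> 15) & ((UINT64_C(1) << 48) - 1);
    return (h >> 43) & UINT64_C(0xFFFFF);      /* bits 62..43 */
}

/* Kind of member i from the in-header fbits (only valid without ebits).
   Assumed order: member 0 is the lowest bit pair. */
static unsigned fbits_kind64(uint64_t h, unsigned i)
{
    assert(i < 14);                            /* 28 fbits = 14 members */
    assert(((h >> 14) & 1) == 0);
    return (unsigned)((h >> (15 + 2 * i)) & 3);
}
```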


ext word
--------
- Only present if ebits is 1. Describes the first 31 members of the object.
     +---------------------------------------------+
     | 0 | - |           extbits                   |
     +---------------------------------------------+
bits   63  62  61                                  0

- bit 63 is used to identify this as a header extension
- extbits: same as fbits
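
The 64-bit ext word lookup, sketched in C. The 62 extbits (bits 61..0) hold
31 two-bit entries; the mapping of member 0 to the lowest pair is an
assumption and the function name is illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Kind of member i (0..30) from a 64-bit ext word.
   Encodings: 0 tagged, 1 int32, 2 float, 3 int64/native int/C pointer. */
static unsigned ext_kind64(uint64_t ext_word, unsigned i)
{
    assert((ext_word >> 63) == 0);   /* extension words have the top bit clear */
    assert(i < 31);                  /* 62 extbits = 31 members */
    return (unsigned)((ext_word >> (2 * i)) & 3);
}
```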

Strings
-------
- In strings, the last fbit, ebits and noptr function as the string size
modifier (currently present at the end of the string). This improves cache
locality on large strings. Wosize expands to include the fbits.

Arrays
------
- Only the 2 lowest fbits are needed to discriminate the type. All other
fbits are joined to wosize.

+ Tags:
-------
- 0: Array tag
- 1: Record tag
- 2: Tuple tag
- 3: Infix tag
- 4: Closure tag
- 5: Lazy tag
- 6: Object tag
- 7: Forward tag
- 8: Abstract tag
- 9: String tag (pfbit type used for size completion)
- 10: Primitive value tag (double, int64, int32)
- 11: Custom tag
- 100: First user tag
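
The tag list written out as a C enum, in case it helps when reading the
layouts. The identifier names are made up for illustration; the upper bound
of 2047 follows from the 11-bit tag field in both headers.

```c
#include <assert.h>

/* Tag numbers from the list above (identifier names are illustrative). */
enum block_tag {
    TAG_ARRAY      = 0,
    TAG_RECORD     = 1,
    TAG_TUPLE      = 2,
    TAG_INFIX      = 3,
    TAG_CLOSURE    = 4,
    TAG_LAZY       = 5,
    TAG_OBJECT     = 6,
    TAG_FORWARD    = 7,
    TAG_ABSTRACT   = 8,
    TAG_STRING     = 9,
    TAG_PRIMITIVE  = 10,   /* double, int64, int32 */
    TAG_CUSTOM     = 11,
    TAG_FIRST_USER = 100
};

#define TAG_MAX 0x7FFu     /* largest value of an 11-bit tag field */

/* True if the tag is in the user range 100..2047. */
static int is_user_tag(unsigned tag)
{
    return tag >= (unsigned)TAG_FIRST_USER && tag <= TAG_MAX;
}
```

That leaves 1948 tags for user constructors, with 12..99 unassigned in the
list.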

Comments?
Yotam


