[Caml-list] Proposal: re-design of ocaml headers

Mailing list for all users of the OCaml language and system.
 help / color / mirror / Atom feed

* [Caml-list] Proposal: re-design of ocaml headers
@ 2013-09-27 14:05 Yotam Barnoy
  2013-09-27 15:08 ` Dmitry Grebeniuk
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Yotam Barnoy @ 2013-09-27 14:05 UTC (permalink / raw)
  To: Ocaml Mailing List

[-- Attachment #1: Type: text/plain, Size: 4946 bytes --]

Following up on the previous thread I started in this general topic, I
present some more thinking I've done on the redesign of ocaml's headers.
The purpose of this redesign is to lift the tag number restriction as well
as size limits on 32-bit platforms. At the same time, header bits can be
used to indicate floats, allowing cheaper usage of floats in data
structures, and even to indicate the presence of pointers, making the
traversal of some data structures by the GC unnecessary.

The basic idea of this redesign is that most allocations need only a small
amount of space, but a large number of tags is a necessity. If you're
allocating a large block of memory (>8KB) then you can spare another word
for the size.

The pfbits (as shown below) are an efficient way of representing both
floats and pointers in a data structure, at the cost of disallowing random
access. From what I can gather, random access to this data is never needed,
since the GC, Marshal module, and polymorphic comparison all process the
whole data structure rather than referring to specific parts of it.

Open issue: making float representation efficient on the stack.


+ For 16-bit and 32-bit architectures:
     +---------------+----+----+-----+-------+------+
     |     wosize    | ext|cust|noptr| color | tag  |
     +---------------+----+----+-----+-------+------+
bits  31           21  20   19   18   17   16 15   0

- noptr: no pointers present
- ext:  uses extension word
- cust(om): uses custom word. Custom word is normally used to indicate
floats and pointers.

32 bit extension word (present only if ext is 1)
     +---------------------------------------------+
     |                   wosize                    |
     +---------------------------------------------+
bits  31                                          0

32 bit custom word (default usage - present only if cust is 1):
     +----+----------------------------------------+
     |nofp|              pfbits                    |
     +----+----------------------------------------+
bits   31  30                                     0

- nofp: a structure with no floats. All pfbits are used for pointers, with
a 1 signifying a pointer and a 0 signifying a value.
- pfbits: indicates which double words are floats and pointers. Starting at
the highest bit:
    - a 0 indicates neither a pointer nor a float
    - a 10 indicates a float (double)
    - a 11 indicates a pointer
    - If noptr is set, each bit indicates a float. If nofp is set, each bit
indicates a pointer.

+ For 64-bit architectures:

     +----------------+--------+----+----+-----+-------+------+
     |     pfbits     | wosize |cust|nofp|noptr| color | tag  |
     +----------------+--------+----+----+-----+-------+------+
bits  63            40 39    21  20   19   18   17   16 15   0

- noptr: a structure with no pointers. All pfbits are used for floats, with
a 1 signifying a float and a 0 signifying a non-float.
- nofp: a structure with no floats. All pfbits are used for pointers, with
a 1 signifying a pointer and a 0 signifying a value.
- If both noptr and nofp are set, wosize is extended to include the pfbits.
- cust(om): uses custom double word. Custom double word is normally used to
indicate more floats and pointers, but functionality can change with
certain tags.
    - If the custom bit is set, wosize is expanded to include the pfbits in
the main header.
- pfbits: indicates which double words are floats and pointers. Starting at
the highest bit:
    - a 0 indicates neither a pointer nor a float
    - a 10 indicates a float (double)
    - a 11 indicates a pointer
    - If noptr is set, each bit indicates a float. If nofp is set, each bit
indicates a pointer.

64 bit custom header (default usage indicated - present only if cust is 1):
     +--------------------------------------------------------+
     |                         pfbits                         |
     +--------------------------------------------------------+
bits  63                                                     0

- pfbits: indicates which double words are floats and pointers. Starting at
the highest bit:
    - a 0 indicates neither a pointer nor a float
    - a 10 indicates a float (double)
    - a 11 indicates a pointer
    - If noptr is set, each bit indicates a float. If nofp is set, each bit
indicates a pointer.

+ Tags:
- I think it's a good idea to move custom tags to the low end of the
spectrum, and add more there if any are needed. This way, if the tag field
is ever expanded, it's not necessary to move the custom tags again.

- 0: Closure tag
- 1: Infix tag (must be 1 mod 4)
- 2: Lazy tag
- 3: Object tag
- 4: Forward tag
- 5: Abstract tag
- 6: String tag
- 7: Double tag
- 8: Custom tag
- 9: Proposed tag: custom array. Half of custom header is used to indicate
array member size, so one could have an array of tuples, saving both memory
and indirections.
- 100: Start of user tags

Yotam Barnoy

[-- Attachment #2: Type: text/html, Size: 5526 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] Proposal: re-design of ocaml headers
  2013-09-27 14:05 [Caml-list] Proposal: re-design of ocaml headers Yotam Barnoy
@ 2013-09-27 15:08 ` Dmitry Grebeniuk
       [not found]   ` <CAN6ygOmuCX6HLfSns0tXQCF3LWMANqhpnSN0vGWcNg0one2QzQ@mail.gmail.com>
  2013-09-27 15:31   ` [Caml-list] " Anthony Tavener
  2013-09-30 14:48 ` Goswin von Brederlow
  2013-10-06 10:39 ` Florian Weimer
  2 siblings, 2 replies; 18+ messages in thread
From: Dmitry Grebeniuk @ 2013-09-27 15:08 UTC (permalink / raw)
  To: Yotam Barnoy

Hello.

  Can you please share your experience of writing bindings to some C
libraries for <your preferable language>?  How it compares to  writing
bindings for OCaml?  (this is a thread about runtime values
representation, I suppose.)

  Why will anyone ever need more than 200 constructors of a sum type?
(also note the presence of polymorphic variant types.)

> random access to this data is never needed

  mkay...  :[

^ permalink raw reply	[flat|nested] 18+ messages in thread

[parent not found: <CAN6ygOmuCX6HLfSns0tXQCF3LWMANqhpnSN0vGWcNg0one2QzQ@mail.gmail.com>]

* [Caml-list] Fwd: Proposal: re-design of ocaml headers
       [not found]   ` <CAN6ygOmuCX6HLfSns0tXQCF3LWMANqhpnSN0vGWcNg0one2QzQ@mail.gmail.com>
@ 2013-09-27 15:25     ` Yotam Barnoy
  2013-09-27 16:20       ` Dmitry Grebeniuk
  0 siblings, 1 reply; 18+ messages in thread
From: Yotam Barnoy @ 2013-09-27 15:25 UTC (permalink / raw)
  To: Ocaml Mailing List

[-- Attachment #1: Type: text/plain, Size: 1748 bytes --]

>
>   Can you please share your experience of writing bindings to some C
> libraries for <your preferable language>?  How it compares to  writing
> bindings for OCaml?  (this is a thread about runtime values
> representation, I suppose.)
>
>
This isn't really relevant to this topic, since this discussion is just
about ocaml headers, rather than the ocaml C FFI. The FFI would remain
largely the same.

>   Why will anyone ever need more than 200 constructors of a sum type?
> (also note the presence of polymorphic variant types.)
>

I thought so too, but as mentioned in the previous discussion thread
(entitled 'Expanding the Float Array Tag'), code generators can easily run
out of 200 constructors. However, it's possible that 65,000 constructors is
really excessive, in which case some bits can be shifted around. I'd be
happy to get some feedback on this.

> > random access to this data is never needed
>
>   mkay...  :[
>

This is only talking about automatic ocaml functions, which work without
any type information. Proper ocaml code is typed, so it doesn't need this
information. But internal mechanisms such as the GC, marshal module, and
polymorphic comparison are 'dumb' in the sense that they're missing all of
the crucial type information. For these functions, there needs to be some
basic type information within the data structure itself. Currently, floats
are stored in their own independent data structure with a 'double_tag'
header, or otherwise they're in a 'double_array'. Having some information
in the header about internal floats and pointers is much more efficient,
but it doesn't need to have random access -- the internal ocaml functions
that use this information process the entire data structure at once.

Yotam

[-- Attachment #2: Type: text/html, Size: 2693 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] Fwd: Proposal: re-design of ocaml headers
  2013-09-27 15:25     ` [Caml-list] Fwd: " Yotam Barnoy
@ 2013-09-27 16:20       ` Dmitry Grebeniuk
  2013-09-27 18:08         ` Yotam Barnoy
  0 siblings, 1 reply; 18+ messages in thread
From: Dmitry Grebeniuk @ 2013-09-27 16:20 UTC (permalink / raw)
  To: Yotam Barnoy

Hello.

>> (this is a thread about runtime values
>> representation, I suppose.)
> This isn't really relevant to this topic, since this discussion is just
> about ocaml headers, rather than the ocaml C FFI. The FFI would remain
> largely the same.

  There is a lot of bindings have to be rewritten due to these
changes.  You can not automate it with C preprocessor.  What would you
suggest here?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] Fwd: Proposal: re-design of ocaml headers
  2013-09-27 16:20       ` Dmitry Grebeniuk
@ 2013-09-27 18:08         ` Yotam Barnoy
  2013-09-27 18:12           ` Yotam Barnoy
  2013-09-27 18:15           ` Paolo Donadeo
  0 siblings, 2 replies; 18+ messages in thread
From: Yotam Barnoy @ 2013-09-27 18:08 UTC (permalink / raw)
  To: Dmitry Grebeniuk, Ocaml Mailing List

[-- Attachment #1: Type: text/plain, Size: 1258 bytes --]

This is a good point. I'm not that familiar with the bindings in ocaml.
What I can say is that bindings should be written at an abstract enough
level that they don't mess directly with internal bit representations.
Using the ctypes library seems like a good way to go. If a bunch of
libraries' bindings have to be rewritten because the internal runtime
representation has changed... well then maybe that should be done once to
allow them to be abstract enough so that future internal changes don't have
the same impact.

Yotam

On Fri, Sep 27, 2013 at 12:20 PM, Dmitry Grebeniuk <gdsfh1@gmail.com> wrote:

> Hello.
>
> >> (this is a thread about runtime values
> >> representation, I suppose.)
> > This isn't really relevant to this topic, since this discussion is just
> > about ocaml headers, rather than the ocaml C FFI. The FFI would remain
> > largely the same.
>
>   There is a lot of bindings have to be rewritten due to these
> changes.  You can not automate it with C preprocessor.  What would you
> suggest here?
>
> --
> Caml-list mailing list.  Subscription management and archives:
> https://sympa.inria.fr/sympa/arc/caml-list
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
>

[-- Attachment #2: Type: text/html, Size: 1981 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] Fwd: Proposal: re-design of ocaml headers
  2013-09-27 18:08         ` Yotam Barnoy
@ 2013-09-27 18:12           ` Yotam Barnoy
  2013-09-27 18:15           ` Paolo Donadeo
  1 sibling, 0 replies; 18+ messages in thread
From: Yotam Barnoy @ 2013-09-27 18:12 UTC (permalink / raw)
  To: Dmitry Grebeniuk, Ocaml Mailing List

[-- Attachment #1: Type: text/plain, Size: 5534 bytes --]

I updated the definitions by reducing the number of bits for the tag to 13
(allowing up to 8000 tags, which definitely seems like it should be
enough). I also forgot to insert the 0 tag, which is for arrays, tuples,
records etc. So that's included now:

+ For 16-bit and 32-bit architectures:
     +---------------+----+----+-----+-------+------+
     |     wosize    | ext|cust|noptr| color | tag  |
     +---------------+----+----+-----+-------+------+
bits  31           18  17   16   15   14   13 12   0

- noptr: no pointers present
- ext:  uses extension word
- cust(om): uses custom word. Custom word is normally used to indicate
floats and pointers.
- If both ext and cust are on, the extension word precedes the custom word
in memory.

32 bit extension word (present only if ext is 1)
     +---------------------------------------------+
     |                   wosize                    |
     +---------------------------------------------+
bits  31                                          0

32 bit custom word (default usage - present only if cust is 1):
     +----+----------------------------------------+
     |nofp|              pfbits                    |
     +----+----------------------------------------+
bits   31  30                                     0

- pfbits: indicates which double words are floats and pointers. Starting at
the highest bit:
    - a 0 indicates neither a pointer nor a float
    - a 10 indicates a float (double)
    - a 11 indicates a pointer
    - If noptr is set, each bit indicates a float. If nofp is set, each bit
indicates a pointer.

+ For 64-bit architectures:

     +----------------+--------+----+----+-----+-------+------+
     |     pfbits     | wosize |cust|nofp|noptr| color | tag  |
     +----------------+--------+----+----+-----+-------+------+
bits  63            37 36    18  17   16   15   14   13 12   0

- noptr: a structure with no pointers. All pfbits are used for floats, with
a 1 signifying a float and a 0 signifying a non-float.
- nofp: a structure with no floats. All pfbits are used for pointers, with
a 1 signifying a pointer and a 0 signifying a value.
- If both noptr and nofp are set, wosize is extended to include the pfbits.
It does NOT mean that the structure has no floats and no pointers, only
that the pfbits field is unused.
- cust(om): uses custom double word. Custom double word is normally used to
indicate more floats and pointers, but functionality can change with
certain tags.
    - If the custom bit is set, wosize is expanded to include the pfbits in
the main header.
- pfbits: indicates which double words are floats and pointers. Starting at
the highest bit:
    - a 0 indicates neither a pointer nor a float
    - a 10 indicates a float (double)
    - a 11 indicates a pointer
    - If noptr is set, each bit indicates a float. If nofp is set, each bit
indicates a pointer.

64 bit custom header (default usage indicated - present only if cust is 1):
     +--------------------------------------------------------+
     |                         pfbits                         |
     +--------------------------------------------------------+
bits  63                                                     0

- pfbits: indicates which double words are floats and pointers. Starting at
the highest bit:
    - a 0 indicates neither a pointer nor a float
    - a 10 indicates a float (double)
    - a 11 indicates a pointer
    - If noptr is set, each bit indicates a float. If nofp is set, each bit
indicates a pointer.

+ Tags:
- I think in general it's a good idea to move proprietary tags to the low
end of the spectrum, and add more there if any are needed. This way, if the
tag field is ever expanded, it's not necessary to move these tags again.

- 0: Array, record, tuple tag
- 1: Infix tag (must be 1 mod 4)
- 2: Closure tag
- 3: Lazy tag
- 4: Object tag
- 5: Forward tag
- 6: Abstract tag
- 7: String tag
- 8: Double tag
- 9: Custom tag
- 10: Proposed tag: custom array. Half of custom header is used to indicate
array member size, so one could have an array of tuples, saving both memory
and indirections.
- 100: Start of user tags

-Yotam



On Fri, Sep 27, 2013 at 2:08 PM, Yotam Barnoy <yotambarnoy@gmail.com> wrote:

> This is a good point. I'm not that familiar with the bindings in ocaml.
> What I can say is that bindings should be written at an abstract enough
> level that they don't mess directly with internal bit representations.
> Using the ctypes library seems like a good way to go. If a bunch of
> libraries' bindings have to be rewritten because the internal runtime
> representation has changed... well then maybe that should be done once to
> allow them to be abstract enough so that future internal changes don't have
> the same impact.
>
> Yotam
>
>
> On Fri, Sep 27, 2013 at 12:20 PM, Dmitry Grebeniuk <gdsfh1@gmail.com>wrote:
>
>> Hello.
>>
>> >> (this is a thread about runtime values
>> >> representation, I suppose.)
>> > This isn't really relevant to this topic, since this discussion is just
>> > about ocaml headers, rather than the ocaml C FFI. The FFI would remain
>> > largely the same.
>>
>>   There is a lot of bindings have to be rewritten due to these
>> changes.  You can not automate it with C preprocessor.  What would you
>> suggest here?
>>
>> --
>> Caml-list mailing list.  Subscription management and archives:
>> https://sympa.inria.fr/sympa/arc/caml-list
>> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
>> Bug reports: http://caml.inria.fr/bin/caml-bugs
>>
>
>

[-- Attachment #2: Type: text/html, Size: 7220 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] Fwd: Proposal: re-design of ocaml headers
  2013-09-27 18:08         ` Yotam Barnoy
  2013-09-27 18:12           ` Yotam Barnoy
@ 2013-09-27 18:15           ` Paolo Donadeo
  2013-09-27 18:41             ` Yotam Barnoy
  1 sibling, 1 reply; 18+ messages in thread
From: Paolo Donadeo @ 2013-09-27 18:15 UTC (permalink / raw)
  To: OCaml mailing list

On Fri, Sep 27, 2013 at 8:08 PM, Yotam Barnoy <yotambarnoy@gmail.com> wrote:
> What I can say is that bindings should be written at an abstract enough
> level that they don't mess directly with internal bit representations.

Many bindings deal with memory allocation. The binding of Lua deeply
interacts with the garbage collector and inspects the actual type
(tag) of OCaml value passed.

In principle you are right but the reality is that there is no much
"abstraction" in C :-)


-- 
Paolo

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] Fwd: Proposal: re-design of ocaml headers
  2013-09-27 18:15           ` Paolo Donadeo
@ 2013-09-27 18:41             ` Yotam Barnoy
  0 siblings, 0 replies; 18+ messages in thread
From: Yotam Barnoy @ 2013-09-27 18:41 UTC (permalink / raw)
  To: Paolo Donadeo; +Cc: OCaml mailing list

[-- Attachment #1: Type: text/plain, Size: 1067 bytes --]

On Fri, Sep 27, 2013 at 2:15 PM, Paolo Donadeo <p.donadeo@gmail.com> wrote:

> Many bindings deal with memory allocation. The binding of Lua deeply
> interacts with the garbage collector and inspects the actual type
> (tag) of OCaml value passed.
>
> In principle you are right but the reality is that there is no much
> "abstraction" in C :-)
>
> It's true that coming from the C side, abstraction is a problem. However,
the kind of bindings you're describing (Lua) essentially inhibit any
improvement to the GC (such as parallelization, which is bound to happen
some day) or to the runtime. It seems like it would be much better if
bindings from C (Lua, for example) called more general functions or
bindings present in the runtime's C code aka an API. Given that it seems
like some bindings currently don't use an API but inspect values closely, I
don't see any solution other than adapting the things that require changes.
Hopefully the majority of C bindings (at least from the ocaml side) can be
moved to ctypes, and can therefore be made more generic.

-Yotam

[-- Attachment #2: Type: text/html, Size: 1520 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] Proposal: re-design of ocaml headers
  2013-09-27 15:08 ` Dmitry Grebeniuk
       [not found]   ` <CAN6ygOmuCX6HLfSns0tXQCF3LWMANqhpnSN0vGWcNg0one2QzQ@mail.gmail.com>
@ 2013-09-27 15:31   ` Anthony Tavener
  2013-09-27 15:37     ` Yotam Barnoy
  2013-09-27 16:50     ` Dmitry Grebeniuk
  1 sibling, 2 replies; 18+ messages in thread
From: Anthony Tavener @ 2013-09-27 15:31 UTC (permalink / raw)
  To: Dmitry Grebeniuk; +Cc: Yotam Barnoy

[-- Attachment #1: Type: text/plain, Size: 794 bytes --]

On Fri, Sep 27, 2013 at 9:08 AM, Dmitry Grebeniuk <gdsfh1@gmail.com> wrote:

>
>   Why will anyone ever need more than 200 constructors of a sum type?
> (also note the presence of polymorphic variant types.)
>
>
Back when I read about the limit on constructors I had a moment of worry.
Thankfully the limit is only on non-constant constructors.

I currently have a variant with 292 constructors, but only 30 are
non-constant.
These represent optional characteristics a character may have in a game,
some
with additional payload. I could imagine all of them having their own
payload,
but of course there are other options, like polymorphic variants, or
encoding these
purely as data rather than types.

I wanted to share that as a "be careful of what seems impossible from your
perspective". ;)

[-- Attachment #2: Type: text/html, Size: 1333 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] Proposal: re-design of ocaml headers
  2013-09-27 15:31   ` [Caml-list] " Anthony Tavener
@ 2013-09-27 15:37     ` Yotam Barnoy
  2013-09-27 16:50     ` Dmitry Grebeniuk
  1 sibling, 0 replies; 18+ messages in thread
From: Yotam Barnoy @ 2013-09-27 15:37 UTC (permalink / raw)
  To: Anthony Tavener; +Cc: Dmitry Grebeniuk, Yotam Barnoy

[-- Attachment #1: Type: text/plain, Size: 1166 bytes --]

On Fri, Sep 27, 2013 at 11:31 AM, Anthony Tavener <anthony.tavener@gmail.com
> wrote:

>
> On Fri, Sep 27, 2013 at 9:08 AM, Dmitry Grebeniuk <gdsfh1@gmail.com>wrote:
>
>>
>>   Why will anyone ever need more than 200 constructors of a sum type?
>> (also note the presence of polymorphic variant types.)
>>
>>
> Back when I read about the limit on constructors I had a moment of worry.
> Thankfully the limit is only on non-constant constructors.
>
> I currently have a variant with 292 constructors, but only 30 are
> non-constant.
> These represent optional characteristics a character may have in a game,
> some
> with additional payload. I could imagine all of them having their own
> payload,
> but of course there are other options, like polymorphic variants, or
> encoding these
> purely as data rather than types.
>
> I wanted to share that as a "be careful of what seems impossible from your
> perspective". ;)
>

Right. Constant constructors are represented as integers and require no
header, so they're very cheap. And polymorphic variants use more memory, so
they're not a great option. Having the flexibility for more constructors is
a good thing.

Yotam

[-- Attachment #2: Type: text/html, Size: 2041 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] Proposal: re-design of ocaml headers
  2013-09-27 15:31   ` [Caml-list] " Anthony Tavener
  2013-09-27 15:37     ` Yotam Barnoy
@ 2013-09-27 16:50     ` Dmitry Grebeniuk
  1 sibling, 0 replies; 18+ messages in thread
From: Dmitry Grebeniuk @ 2013-09-27 16:50 UTC (permalink / raw)
  To: Anthony Tavener

Hello.

  (first, personal: nice to see someone developing games in OCaml!
Wish you good profit here.  Btw, I'd like to see an announcement of
your game here.)

> I currently have a variant with 292 constructors, but only 30 are
> non-constant.
> These represent optional characteristics a character may have in a game,
> some
> with additional payload.

  Things I can think of:

- Are these characteristics "flat", ungroupped, without any common
logic like "health-affecting", "strength-affecting"?  Maybe it would
be nice (w.r.t. the logic) to group them?

- Sum types should be matched fully, each constructor should be
present in "match .. with".  That's a point of sum types and
guarantees about them (there's a warning about it in the compiler,
great thing).  I doubt you usually have such heavy matches.  So if you
don't make use of such kind of guarantees, you can use even exceptions
to encode characteristics.  (polymorphic variants are fine too, of
course.)

> I wanted to share that as a "be careful of what seems impossible from your
> perspective". ;)

  Got it.  But any language has some limits of what can one achieve
with it "out of box and using common ways".  There are other ways to
make things work.  So one of my mottos is "keep caml and curry on".

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] Proposal: re-design of ocaml headers
  2013-09-27 14:05 [Caml-list] Proposal: re-design of ocaml headers Yotam Barnoy
  2013-09-27 15:08 ` Dmitry Grebeniuk
@ 2013-09-30 14:48 ` Goswin von Brederlow
  2013-09-30 15:31   ` Yotam Barnoy
  2013-10-06 10:39 ` Florian Weimer
  2 siblings, 1 reply; 18+ messages in thread
From: Goswin von Brederlow @ 2013-09-30 14:48 UTC (permalink / raw)
  To: caml-list

On Fri, Sep 27, 2013 at 10:05:56AM -0400, Yotam Barnoy wrote:
> Following up on the previous thread I started in this general topic, I
> present some more thinking I've done on the redesign of ocaml's headers.
> The purpose of this redesign is to lift the tag number restriction as well
> as size limits on 32-bit platforms. At the same time, header bits can be
> used to indicate floats, allowing cheaper usage of floats in data
> structures, and even to indicate the presence of pointers, making the
> traversal of some data structures by the GC unnecessary.
> 
> The basic idea of this redesign is that most allocations need only a small
> amount of space, but a large number of tags is a necessity. If you're
> allocating a large block of memory (>8KB) then you can spare another word
> for the size.
> 
> The pfbits (as shown below) are an efficient way of representing both
> floats and pointers in a data structure, at the cost of disallowing random
> access. From what I can gather, random access to this data is never needed,
> since the GC, Marshal module, and polymorphic comparison all process the
> whole data structure rather than referring to specific parts of it.
> 
> Open issue: making float representation efficient on the stack.
> 
> 
> + For 16-bit and 32-bit architectures:
>      +---------------+----+----+-----+-------+------+
>      |     wosize    | ext|cust|noptr| color | tag  |
>      +---------------+----+----+-----+-------+------+
> bits  31           21  20   19   18   17   16 15   0
> 
> - noptr: no pointers present
> - ext:  uses extension word
> - cust(om): uses custom word. Custom word is normally used to indicate
> floats and pointers.
> 
> 32 bit extension word (present only if ext is 1)
>      +---------------------------------------------+
>      |                   wosize                    |
>      +---------------------------------------------+
> bits  31                                          0

Why use a full bit for ext? I would define wosize == 0 to mean an
extension word with the actual size is present. That way sizes up to
<16KB can be encoded without extension word.

> 32 bit custom word (default usage - present only if cust is 1):
>      +----+----------------------------------------+
>      |nofp|              pfbits                    |
>      +----+----------------------------------------+
> bits   31  30                                     0
> 
> - nofp: a structure with no floats. All pfbits are used for pointers, with
> a 1 signifying a pointer and a 0 signifying a value.
> - pfbits: indicates which double words are floats and pointers. Starting at
> the highest bit:
>     - a 0 indicates neither a pointer nor a float
>     - a 10 indicates a float (double)
>     - a 11 indicates a pointer
>     - If noptr is set, each bit indicates a float. If nofp is set, each bit
> indicates a pointer.

There are 3 kinds of values:

1) pointers with bit 0 == 0
2) non-pointers with bit 0 == 1
3) floats with all bits used for the type (spanning 2 fields in 32bit)

So if pfbits indicates a float then a field (or 2) is a float and all
bits are used for the value. Otherwise the bit 0 of the field will
tell you wether it is a pointer or not. So why would you want to
duplicate that information in the pfbits?

Or do you want to store untagged/unboxed values in blocks?

That also means that the nofp bit is pointless. If there are no floats
then no custom word is required. And if no custom word is given then the
block must not have any floats.


I do love the noptr bit though. Blocks without pointers are quite
common and int/bool/char arrays will hugely benefit from that.

 
> + For 64-bit architectures:
> 
>      +----------------+--------+----+----+-----+-------+------+
>      |     pfbits     | wosize |cust|nofp|noptr| color | tag  |
>      +----------------+--------+----+----+-----+-------+------+
> bits  63            40 39    21  20   19   18   17   16 15   0

Isn't wosize a bit small there? That limits blocks to 4MiB total, right?
 
> - noptr: a structure with no pointers. All pfbits are used for floats, with
> a 1 signifying a float and a 0 signifying a non-float.
> - nofp: a structure with no floats. All pfbits are used for pointers, with
> a 1 signifying a pointer and a 0 signifying a value.
> - If both noptr and nofp are set, wosize is extended to include the pfbits.
> - cust(om): uses custom double word. Custom double word is normally used to
> indicate more floats and pointers, but functionality can change with
> certain tags.
>     - If the custom bit is set, wosize is expanded to include the pfbits in
> the main header.
> - pfbits: indicates which double words are floats and pointers. Starting at
> the highest bit:
>     - a 0 indicates neither a pointer nor a float
>     - a 10 indicates a float (double)
>     - a 11 indicates a pointer
>     - If noptr is set, each bit indicates a float. If nofp is set, each bit
> indicates a pointer.

As above pointer should not be in the pfbits. So if nofp is set then
wosize would be extended to include pfbits. Same if Double tag is set.
That would allow for larger arrays without floats or only floats. I
guess blocks with a mix of float and non-floats will not be large.
Even for generated code I don't expect constructors or recrods with
more than 500k labels.

> 64 bit custom header (default usage indicated - present only if cust is 1):
>      +--------------------------------------------------------+
>      |                         pfbits                         |
>      +--------------------------------------------------------+
> bits  63                                                     0
> 
> - pfbits: indicates which double words are floats and pointers. Starting at
> the highest bit:
>     - a 0 indicates neither a pointer nor a float
>     - a 10 indicates a float (double)
>     - a 11 indicates a pointer
>     - If noptr is set, each bit indicates a float. If nofp is set, each bit
> indicates a pointer.
> 
> + Tags:
> - I think it's a good idea to move custom tags to the low end of the
> spectrum, and add more there if any are needed. This way, if the tag field
> is ever expanded, it's not necessary to move the custom tags again.
> 
> - 0: Closure tag
> - 1: Infix tag (must be 1 mod 4)
> - 2: Lazy tag
> - 3: Object tag
> - 4: Forward tag
> - 5: Abstract tag
> - 6: String tag
> - 7: Double tag
> - 8: Custom tag
> - 9: Proposed tag: custom array. Half of custom header is used to indicate
> array member size, so one could have an array of tuples, saving both memory
> and indirections.
> - 100: Start of user tags
> 
> Yotam Barnoy

It might be nice to support C values like untagged ints or unaligned
pointers. If Custom tag is set then the pfbits become ocaml value
bits. The GC will only inspect fields with pfbit set. All other fields
are ignored. The custom_operations handle compare, hash, serialize and
deserialize so nothing else will access the data.

Another thing are int32 and int64. I guess if you want to unbox those
then having 2 bits per field in pfbits makes sense again. But then I
would allocate them as:

    - a 00 indicates a tagged value (int or pointer)
    - a 01 indicates a non-pointer: int, int32, native int, C pointer
    - a 10 indicates a float (double)
    - a 11 indicates an int64

The higher bit would indicate a 64bit value, meaning spanning 2 fields
on 32bit. Not that those 4 values allow mixing ocaml values, C values,
int32, int64 and float in a block.

I would combine the noptr and nofp bits into a single 2bit field:

    - a 00 indicates no pointers and no double size, no pfbits
    - a 01 indicates no double size, pfbits indicate tagged / non-pointer
    - a 10 indicates no pointers but double size, pfbits indicate size
    - a 11 indicates both pointers and double size, 2 pfbits per field

Note: tagged integers can be stored as 00 or 01. I think this would be
required for polymorphic types. An 'a could be int or pointer. In both
cases 00 will work.

MfG
	Goswin

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] Proposal: re-design of ocaml headers
  2013-09-30 14:48 ` Goswin von Brederlow
@ 2013-09-30 15:31   ` Yotam Barnoy
  2013-10-08 10:52     ` Goswin von Brederlow
  0 siblings, 1 reply; 18+ messages in thread
From: Yotam Barnoy @ 2013-09-30 15:31 UTC (permalink / raw)
  To: Goswin von Brederlow; +Cc: Ocaml Mailing List

[-- Attachment #1: Type: text/plain, Size: 4829 bytes --]

On Mon, Sep 30, 2013 at 10:48 AM, Goswin von Brederlow <goswin-v-b@web.de>wrote:

> >
> > + For 16-bit and 32-bit architectures:
> >      +---------------+----+----+-----+-------+------+
> >      |     wosize    | ext|cust|noptr| color | tag  |
> >      +---------------+----+----+-----+-------+------+
> > bits  31           21  20   19   18   17   16 15   0
> >
> > - noptr: no pointers present
> > - ext:  uses extension word
> > - cust(om): uses custom word. Custom word is normally used to indicate
> > floats and pointers.
> >
> > 32 bit extension word (present only if ext is 1)
> >      +---------------------------------------------+
> >      |                   wosize                    |
> >      +---------------------------------------------+
> > bits  31                                          0
>
> Why use a full bit for ext? I would define wosize == 0 to mean an
> extension word with the actual size is present. That way sizes up to
> <16KB can be encoded without extension word.
>
>
Great point! Of course, that makes perfect sense. I was feeling like I was
wasting the wosize bits with the extension word but couldn't quite get put
2 and 2 together.
BTW, down the thread is a newer version of the design that reduces the tag
space to 8000 tags, which I do think is sufficient.



>  > 32 bit custom word (default usage - present only if cust is 1):
> >      +----+----------------------------------------+
> >      |nofp|              pfbits                    |
> >      +----+----------------------------------------+
> > bits   31  30                                     0
> >
> > - nofp: a structure with no floats. All pfbits are used for pointers,
> with
> > a 1 signifying a pointer and a 0 signifying a value.
> > - pfbits: indicates which double words are floats and pointers. Starting
> at
> > the highest bit:
> >     - a 0 indicates neither a pointer nor a float
> >     - a 10 indicates a float (double)
> >     - a 11 indicates a pointer
> >     - If noptr is set, each bit indicates a float. If nofp is set, each
> bit
> > indicates a pointer.
>
> There are 3 kinds of values:
>
> 1) pointers with bit 0 == 0
> 2) non-pointers with bit 0 == 1
> 3) floats with all bits used for the type (spanning 2 fields in 32bit)
>
> So if pfbits indicates a float then a field (or 2) is a float and all
> bits are used for the value. Otherwise the bit 0 of the field will
> tell you wether it is a pointer or not. So why would you want to
> duplicate that information in the pfbits?
>

I was thinking of doing it for efficiency. If we're already indicating
what's what, we might as well represent shortcuts to the pointers, which
would cut down on the amount of reading, no? In the average case, the GC
would need to access a lot less memory.


> It might be nice to support C values like untagged ints or unaligned
> pointers. If Custom tag is set then the pfbits become ocaml value
> bits. The GC will only inspect fields with pfbit set. All other fields
> are ignored. The custom_operations handle compare, hash, serialize and
> deserialize so nothing else will access the data.
>
> Another thing are int32 and int64. I guess if you want to unbox those
> then having 2 bits per field in pfbits makes sense again. But then I
> would allocate them as:
>
>     - a 00 indicates a tagged value (int or pointer)
>     - a 01 indicates a non-pointer: int, int32, native int, C pointer
>     - a 10 indicates a float (double)
>     - a 11 indicates an int64
>
> The higher bit would indicate a 64bit value, meaning spanning 2 fields
> on 32bit. Not that those 4 values allow mixing ocaml values, C values,
> int32, int64 and float in a block.
>
> I would combine the noptr and nofp bits into a single 2bit field:
>
>     - a 00 indicates no pointers and no double size, no pfbits
>     - a 01 indicates no double size, pfbits indicate tagged / non-pointer
>     - a 10 indicates no pointers but double size, pfbits indicate size
>     - a 11 indicates both pointers and double size, 2 pfbits per field
>
> Note: tagged integers can be stored as 00 or 01. I think this would be
> required for polymorphic types. An 'a could be int or pointer. In both
> cases 00 will work.
>
>
I really like this idea -- unboxing more types could be really useful. I'm
not sure double 'size' would work, however. It should be fine for the
marshal module, but polymorphic comparison would get messed up because
floats have to be compared differently. So I think 10 in the bit field
should indicate no pointers but floats, while 11 could allow both pointers
and double size, with the 2-bits specifying if it's a float or an int64 (as
you've outlined). Of course, one cannot have both shortcuts to pointers and
enhanced unboxing, so let me know what you think about the performance
increase from shortcutting the tag bit.

Yotam

[-- Attachment #2: Type: text/html, Size: 5931 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] Proposal: re-design of ocaml headers
  2013-09-30 15:31   ` Yotam Barnoy
@ 2013-10-08 10:52     ` Goswin von Brederlow
  2013-10-11 15:48       ` Yotam Barnoy
  0 siblings, 1 reply; 18+ messages in thread
From: Goswin von Brederlow @ 2013-10-08 10:52 UTC (permalink / raw)
  To: Yotam Barnoy; +Cc: Ocaml Mailing List

On Mon, Sep 30, 2013 at 11:31:23AM -0400, Yotam Barnoy wrote:
> On Mon, Sep 30, 2013 at 10:48 AM, Goswin von Brederlow <goswin-v-b@web.de>wrote:
> 
> > >
> > > + For 16-bit and 32-bit architectures:
> > >      +---------------+----+----+-----+-------+------+
> > >      |     wosize    | ext|cust|noptr| color | tag  |
> > >      +---------------+----+----+-----+-------+------+
> > > bits  31           21  20   19   18   17   16 15   0
> > >
> > > - noptr: no pointers present
> > > - ext:  uses extension word
> > > - cust(om): uses custom word. Custom word is normally used to indicate
> > > floats and pointers.
> > >
> > > 32 bit extension word (present only if ext is 1)
> > >      +---------------------------------------------+
> > >      |                   wosize                    |
> > >      +---------------------------------------------+
> > > bits  31                                          0
> >
> > Why use a full bit for ext? I would define wosize == 0 to mean an
> > extension word with the actual size is present. That way sizes up to
> > <16KB can be encoded without extension word.
> >
> >
> Great point! Of course, that makes perfect sense. I was feeling like I was
> wasting the wosize bits with the extension word but couldn't quite get put
> 2 and 2 together.
> BTW, down the thread is a newer version of the design that reduces the tag
> space to 8000 tags, which I do think is sufficient.
> 
> 
> 
> >  > 32 bit custom word (default usage - present only if cust is 1):
> > >      +----+----------------------------------------+
> > >      |nofp|              pfbits                    |
> > >      +----+----------------------------------------+
> > > bits   31  30                                     0
> > >
> > > - nofp: a structure with no floats. All pfbits are used for pointers,
> > with
> > > a 1 signifying a pointer and a 0 signifying a value.
> > > - pfbits: indicates which double words are floats and pointers. Starting
> > at
> > > the highest bit:
> > >     - a 0 indicates neither a pointer nor a float
> > >     - a 10 indicates a float (double)
> > >     - a 11 indicates a pointer
> > >     - If noptr is set, each bit indicates a float. If nofp is set, each
> > bit
> > > indicates a pointer.
> >
> > There are 3 kinds of values:
> >
> > 1) pointers with bit 0 == 0
> > 2) non-pointers with bit 0 == 1
> > 3) floats with all bits used for the type (spanning 2 fields in 32bit)
> >
> > So if pfbits indicates a float then a field (or 2) is a float and all
> > bits are used for the value. Otherwise the bit 0 of the field will
> > tell you wether it is a pointer or not. So why would you want to
> > duplicate that information in the pfbits?
> >
> 
> I was thinking of doing it for efficiency. If we're already indicating
> what's what, we might as well represent shortcuts to the pointers, which
> would cut down on the amount of reading, no? In the average case, the GC
> would need to access a lot less memory.
> 
> 
> > It might be nice to support C values like untagged ints or unaligned
> > pointers. If Custom tag is set then the pfbits become ocaml value
> > bits. The GC will only inspect fields with pfbit set. All other fields
> > are ignored. The custom_operations handle compare, hash, serialize and
> > deserialize so nothing else will access the data.
> >
> > Another thing are int32 and int64. I guess if you want to unbox those
> > then having 2 bits per field in pfbits makes sense again. But then I
> > would allocate them as:
> >
> >     - a 00 indicates a tagged value (int or pointer)
> >     - a 01 indicates a non-pointer: int, int32, native int, C pointer
> >     - a 10 indicates a float (double)
> >     - a 11 indicates an int64
> >
> > The higher bit would indicate a 64bit value, meaning spanning 2 fields
> > on 32bit. Not that those 4 values allow mixing ocaml values, C values,
> > int32, int64 and float in a block.
> >
> > I would combine the noptr and nofp bits into a single 2bit field:
> >
> >     - a 00 indicates no pointers and no double size, no pfbits
> >     - a 01 indicates no double size, pfbits indicate tagged / non-pointer
> >     - a 10 indicates no pointers but double size, pfbits indicate size
> >     - a 11 indicates both pointers and double size, 2 pfbits per field
> >
> > Note: tagged integers can be stored as 00 or 01. I think this would be
> > required for polymorphic types. An 'a could be int or pointer. In both
> > cases 00 will work.
> >
> >
> I really like this idea -- unboxing more types could be really useful. I'm
> not sure double 'size' would work, however. It should be fine for the
> marshal module, but polymorphic comparison would get messed up because
> floats have to be compared differently. So I think 10 in the bit field
> should indicate no pointers but floats, while 11 could allow both pointers
> and double size, with the 2-bits specifying if it's a float or an int64 (as
> you've outlined). Of course, one cannot have both shortcuts to pointers and
> enhanced unboxing, so let me know what you think about the performance
> increase from shortcutting the tag bit.
> 
> Yotam

Lets look at an example:

type 'a r = { a:int; b:float; c:int32; d:int64; e:'a; }

For 16-bit and 32-bit architectures:
     +--------------------+----------+-------+------+
     |     wosize         |pfbit type| color | tag  |
     +--------------------+----------+-------+------+
bits  31               20   19   18   17   16 15   0

wosize = 7
pfbit type = 11 (pointers and double size)

     +------------------------------+--+--+--+--+--+
     |                   pfbits     |00|11|01|10|01|
     +------------------------------+--+--+--+--+--+
                                      e  d  c  b  a

The GC only needs to check e since 'a might be a pointer. All other fields
are marked as non pointer.

Comparison does a plain bit comparison on a, c and d, a float
comparison on b and a tagged comparison on e. Similar for marshaling.
There is no confusion between int64 and floats.

MfG
	Goswin

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] Proposal: re-design of ocaml headers
  2013-10-08 10:52     ` Goswin von Brederlow
@ 2013-10-11 15:48       ` Yotam Barnoy
  2014-01-30 20:53         ` Yotam Barnoy
  2014-02-01 15:27         ` Goswin von Brederlow
  0 siblings, 2 replies; 18+ messages in thread
From: Yotam Barnoy @ 2013-10-11 15:48 UTC (permalink / raw)
  To: Goswin von Brederlow; +Cc: Ocaml Mailing List

[-- Attachment #1: Type: text/plain, Size: 10282 bytes --]

I had an idea based on (or actually copied from) something Goswin mentioned
earlier in our discussion. What if the bits can be used to indicate
embedded values? An embedded value would have a header inside the body of
the parent value, making it possible to get rid of all pointers within that
parent. This would represent a potentially large saving for the GC, as well
as reducing random jumps around memory which are very cache-stressing.

In the spirit of this idea, here is my latest version:

+ For 16-bit and 32-bit architectures:
     +---------------+-------------+-----+-------+------+
     |       wosize  |pfbits type  |noptr| color | tag  |
     +---------------+-------------+-----+-------+------+
bits  31           19  18   17   16   15  14   13 12   0

pfbits type:
- 000: no pfbits
- 001: pfbits indicate embedded header
- 010: pfbits indicate float
- 011: pfbits indicate untagged type
- 100: pfbits indicate int64
- 101: pfbits indicate float/embdedded header/untagged type, 2 bits per
field
- 110: pfbits indicate float/untagged type/int64, 2 bits per field
- 111: pfbits indicate embedded header/untagged type/int64, 2 bits per field

- noptr: no pointers present
- if wosize = 0, the extension word is used for wosize
- if both wosize = 0 and the pfbits are used, the wosize_large is first in
memory

wosize_large word (if wosize is 0 in the header)
     +---------------------------------------------+
     |                   wosize                    |
     +---------------------------------------------+
bits  31                                          0

32 bit pfbits word (present only if called for by pfbits type in header)
     +---------------------------------------------+
     |                   pfbits                    |
     +---------------------------------------------+
bits   31  30                                     0

- pfbits:
    - If working in 1 bit mode, each bit represents whatever is signaled in
the pfbits type field.
    - If working in 2 bit mode, 00 always represents a regular word, and
01, 10, 11 represent their type as signaled in the pfbit type field.



+ For 64-bit architectures:

     +-------------+--------+----------+---+------+-------+------+
     |     pfbits  | wosize |pfbit type|exp| noptr| color | tag  |
     +-------------+--------+----------+---+------+-------+------+
bits  63         40 39    20 19      17 16    15   14   13 12   0


- noptr: a structure with no pointers.
- pfbits: a small pfbits field for smaller objects
- pfbits type: (slightly different than 32-bit architecture)
    - 000: no pfbits, wosize includes pfbits as its upper bits
    - 001: pfbits indicate embedded header
    - 010: pfbits indicate float
    - 011: pfbits indicate untagged type
    - 100: pfbits indicate int64
    - 101: pfbits indicate float/untagged type/embedded header, 2 bits per
field
    - 110: pfbits indicate float/untagged type/int64, 2 bits per field
    - 111: pfbits indicate int64/untagged type/embedded header, 2 bits per
field
- exp: use pfbits_expanded for signaling pfbits. Pfbits in header become
top bits of wosize.


     +--------------------------------------------------------+
     |                pfbits_expanded                         |
     +--------------------------------------------------------+
bits  63                                                     0

- pfbits_expanded: if exp is set, pfbits_expanded takes the place of the
pfbits. wo_size is joined with the pfbits in the header.

+ Tags:

- 0: Array, record, tuple tag
- 1: Infix tag (must be 1 mod 4)
- 2: Closure tag
- 3: Lazy tag
- 4: Object tag
- 5: Forward tag
- 6: Abstract tag
- 7: String tag
- 8: Double tag
- 9: Custom tag
- 10: Double_array tag
- 11: Proposed: Int32_array tag
- 12: Proposed: Int64_array tag
- 13: Proposed: Cptr_array tag
- 14: Proposed: float32_array tag
- 1000: first user tag

-Yotam


On Tue, Oct 8, 2013 at 6:52 AM, Goswin von Brederlow <goswin-v-b@web.de>wrote:

> On Mon, Sep 30, 2013 at 11:31:23AM -0400, Yotam Barnoy wrote:
> > On Mon, Sep 30, 2013 at 10:48 AM, Goswin von Brederlow <
> goswin-v-b@web.de>wrote:
> >
> > > >
> > > > + For 16-bit and 32-bit architectures:
> > > >      +---------------+----+----+-----+-------+------+
> > > >      |     wosize    | ext|cust|noptr| color | tag  |
> > > >      +---------------+----+----+-----+-------+------+
> > > > bits  31           21  20   19   18   17   16 15   0
> > > >
> > > > - noptr: no pointers present
> > > > - ext:  uses extension word
> > > > - cust(om): uses custom word. Custom word is normally used to
> indicate
> > > > floats and pointers.
> > > >
> > > > 32 bit extension word (present only if ext is 1)
> > > >      +---------------------------------------------+
> > > >      |                   wosize                    |
> > > >      +---------------------------------------------+
> > > > bits  31                                          0
> > >
> > > Why use a full bit for ext? I would define wosize == 0 to mean an
> > > extension word with the actual size is present. That way sizes up to
> > > <16KB can be encoded without extension word.
> > >
> > >
> > Great point! Of course, that makes perfect sense. I was feeling like I
> was
> > wasting the wosize bits with the extension word but couldn't quite get
> put
> > 2 and 2 together.
> > BTW, down the thread is a newer version of the design that reduces the
> tag
> > space to 8000 tags, which I do think is sufficient.
> >
> >
> >
> > >  > 32 bit custom word (default usage - present only if cust is 1):
> > > >      +----+----------------------------------------+
> > > >      |nofp|              pfbits                    |
> > > >      +----+----------------------------------------+
> > > > bits   31  30                                     0
> > > >
> > > > - nofp: a structure with no floats. All pfbits are used for pointers,
> > > with
> > > > a 1 signifying a pointer and a 0 signifying a value.
> > > > - pfbits: indicates which double words are floats and pointers.
> Starting
> > > at
> > > > the highest bit:
> > > >     - a 0 indicates neither a pointer nor a float
> > > >     - a 10 indicates a float (double)
> > > >     - a 11 indicates a pointer
> > > >     - If noptr is set, each bit indicates a float. If nofp is set,
> each
> > > bit
> > > > indicates a pointer.
> > >
> > > There are 3 kinds of values:
> > >
> > > 1) pointers with bit 0 == 0
> > > 2) non-pointers with bit 0 == 1
> > > 3) floats with all bits used for the type (spanning 2 fields in 32bit)
> > >
> > > So if pfbits indicates a float then a field (or 2) is a float and all
> > > bits are used for the value. Otherwise the bit 0 of the field will
> > > tell you wether it is a pointer or not. So why would you want to
> > > duplicate that information in the pfbits?
> > >
> >
> > I was thinking of doing it for efficiency. If we're already indicating
> > what's what, we might as well represent shortcuts to the pointers, which
> > would cut down on the amount of reading, no? In the average case, the GC
> > would need to access a lot less memory.
> >
> >
> > > It might be nice to support C values like untagged ints or unaligned
> > > pointers. If Custom tag is set then the pfbits become ocaml value
> > > bits. The GC will only inspect fields with pfbit set. All other fields
> > > are ignored. The custom_operations handle compare, hash, serialize and
> > > deserialize so nothing else will access the data.
> > >
> > > Another thing are int32 and int64. I guess if you want to unbox those
> > > then having 2 bits per field in pfbits makes sense again. But then I
> > > would allocate them as:
> > >
> > >     - a 00 indicates a tagged value (int or pointer)
> > >     - a 01 indicates a non-pointer: int, int32, native int, C pointer
> > >     - a 10 indicates a float (double)
> > >     - a 11 indicates an int64
> > >
> > > The higher bit would indicate a 64bit value, meaning spanning 2 fields
> > > on 32bit. Not that those 4 values allow mixing ocaml values, C values,
> > > int32, int64 and float in a block.
> > >
> > > I would combine the noptr and nofp bits into a single 2bit field:
> > >
> > >     - a 00 indicates no pointers and no double size, no pfbits
> > >     - a 01 indicates no double size, pfbits indicate tagged /
> non-pointer
> > >     - a 10 indicates no pointers but double size, pfbits indicate size
> > >     - a 11 indicates both pointers and double size, 2 pfbits per field
> > >
> > > Note: tagged integers can be stored as 00 or 01. I think this would be
> > > required for polymorphic types. An 'a could be int or pointer. In both
> > > cases 00 will work.
> > >
> > >
> > I really like this idea -- unboxing more types could be really useful.
> I'm
> > not sure double 'size' would work, however. It should be fine for the
> > marshal module, but polymorphic comparison would get messed up because
> > floats have to be compared differently. So I think 10 in the bit field
> > should indicate no pointers but floats, while 11 could allow both
> pointers
> > and double size, with the 2-bits specifying if it's a float or an int64
> (as
> > you've outlined). Of course, one cannot have both shortcuts to pointers
> and
> > enhanced unboxing, so let me know what you think about the performance
> > increase from shortcutting the tag bit.
> >
> > Yotam
>
> Lets look at an example:
>
> type 'a r = { a:int; b:float; c:int32; d:int64; e:'a; }
>
> For 16-bit and 32-bit architectures:
>      +--------------------+----------+-------+------+
>      |     wosize         |pfbit type| color | tag  |
>      +--------------------+----------+-------+------+
> bits  31               20   19   18   17   16 15   0
>
> wosize = 7
> pfbit type = 11 (pointers and double size)
>
>      +------------------------------+--+--+--+--+--+
>      |                   pfbits     |00|11|01|10|01|
>      +------------------------------+--+--+--+--+--+
>                                       e  d  c  b  a
>
> The GC only needs to check e since 'a might be a pointer. All other fields
> are marked as non pointer.
>
> Comparison does a plain bit comparison on a, c and d, a float
> comparison on b and a tagged comparison on e. Similar for marshaling.
> There is no confusion between int64 and floats.
>
> MfG
>         Goswin
>

[-- Attachment #2: Type: text/html, Size: 12300 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] Proposal: re-design of ocaml headers
  2013-10-11 15:48       ` Yotam Barnoy
@ 2014-01-30 20:53         ` Yotam Barnoy
  2014-02-01 15:27         ` Goswin von Brederlow
  1 sibling, 0 replies; 18+ messages in thread
From: Yotam Barnoy @ 2014-01-30 20:53 UTC (permalink / raw)
  To: Goswin von Brederlow; +Cc: Ocaml Mailing List

[-- Attachment #1: Type: text/plain, Size: 5185 bytes --]

I'm resurrecting this thread.

Given the recent discussion about optimization, I've done some more
thinking and simplification work on my proposal for a better ocaml header.
Much of this is just adopting Goswin's proposal, which made a lot of sense.
I tried to also reduce the variations while allowing for a lot of useful
unboxing of floats, int64s etc. I've also tried to tackle tuples in this
version -- basically, in order to be compatible with polymorphic functions,
you really want tuples and all other polymorphic types (as in 'a embedded
inside records) to be of uniform size. Expanding that size under some
circumstances to 64 bits on 32 bit platforms makes sense if, for example,
the tuple contains many floats. I don't think handling both what I call
narrow and wide polymorphic types (for lack of a better name) will make the
code much more complex -- I certainly don't want it to result in heavy
register spilling.

Also, the dynamic parts of the header now appear BEFORE the header itself.
Their highest bit is 0 to indicate that they're not the main header. This
will be seen by GC code, which can afford to do the few extra comparisons.
Mutator code will always have the main header available right before the
data.


So here it is:


+ For 32-bit architectures:
---------------------------

Wide Types
----------
- For polymorphic types (containing 'a), member sizes need to be uniform
for speed. Tuples are fully polymorphic. These polymorphic members can
either be narrow (32 bit) as they are now or wide (64 bit) to accomodate
float/int64 unboxing.

     +-----------+--------------+------+-----+-------+------+
     | 1 | wosize|    fbits     |ebits |noptr| color | tag  |
     +-----------+--------------+------+-----+-------+------+
bits  31  30   21 20          15   14     13  12   11 10   0

- noptr: no pointers present.
- fbits: wide types cannot be represented unless extbits is used
    - 00: tagged (int/pointer)
    - 01: int32/native int, C pointer
    - 10: float
    - 11: int64

- if wosize = 0, the size word is used for wosize

size word
---------
- only present if wosize is 0 in the header. Precedes the main header
     +---------------------------------------------+
     | 0 |               wosize                    |
     +---------------------------------------------+
bits  31  30                                      0

- bit 31 is used to identify this as a header extension


ext word
--------
- Only present if ebits is 1. Describes the first 15 members of the object
(+ 3 members from fbits)
     +----------+----------------------------------+
     | 0 | wide |        extbits                   |
     +----------+----------------------------------+
bits   31  30     29                              0

- bit 31 is used to identify this as a header extension
- wide: determines if 'a types in the first 18 words are wide (64 bit) or
narrow (32 bits)
- extbits: same as fbits


Tuples
------
- Tuples with any fbits on in the header word are automatically wide tuples
(64 bit) for the first 3 words
- Tuples with ebits are automatically wide tuples for the first 18 words.
wide is ignored.

Strings
-------
- In strings, the last fbit, ebits and noptr function as the string size
modifier (currently present at the end of the string). This improves cache
locality on large strings. Wosize expands to include the fbits.

Arrays
------
- Arrays of integers, int64, floats, C pointers etc can all be handled with
the regular array type.
- Only 2 lower fbits are needed for the type. The other fbits are joined
with wosize.


+ 64-bit architectures:
--------------------------

     +-----------+------------------+------+-----+-------+------+
     | 1 | wosize|    fbits         |ebits |noptr| color | tag  |
     +-----------+------------------+------+-----+-------+------+
bits  63  62   43 42              15   14     13  12   11 10   0


- noptr: a structure with no pointers.
- fbits: 2 bits per object member
    - 00: tagged (int/pointer)
    - 01: int32
    - 10: float
    - 11: int64/native int/C pointer

- ebits: use ext word for signaling bits. fbits in header become bottom
bits of wosize.


ext_word
--------
- Only present if ebits is 1. Describes the first 31 members of the object.
     +---------------------------------------------+
     | 0 | - |           extbits                   |
     +---------------------------------------------+
bits   63  62  61                                  0

- bit 31 is used to identify this as a header extension
- extbits: same as fbits

Strings
-------
- In strings, the last fbit, ebits and noptr function as the string size
modifier (currently present at the end of the string). This improves cache
locality on large strings. Wosize expands

Arrays
------
- Only the 2 lowest fbits are needed to discriminate the type. All other
fbits are joined to wosize.

+ Tags:
-------
- 0: Array tag
- 1: Record tag
- 2: Tuple tag
- 3: Infix tag
- 4: Closure tag
- 5: Lazy tag
- 6: Object tag
- 7: Forward tag
- 8: Abstract tag
- 9: String tag (pfbit type used for size completion)
- 10: Primitive value tag (double, int64, int32)
- 11: Custom tag
- 100: First user tag

Comments?
Yotam

[-- Attachment #2: Type: text/html, Size: 6012 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] Proposal: re-design of ocaml headers
  2013-10-11 15:48       ` Yotam Barnoy
  2014-01-30 20:53         ` Yotam Barnoy
@ 2014-02-01 15:27         ` Goswin von Brederlow
  1 sibling, 0 replies; 18+ messages in thread
From: Goswin von Brederlow @ 2014-02-01 15:27 UTC (permalink / raw)
  To: Yotam Barnoy; +Cc: Ocaml Mailing List

On Fri, Oct 11, 2013 at 11:48:52AM -0400, Yotam Barnoy wrote:
> I had an idea based on (or actually copied from) something Goswin mentioned
> earlier in our discussion. What if the bits can be used to indicate
> embedded values? An embedded value would have a header inside the body of
> the parent value, making it possible to get rid of all pointers within that
> parent. This would represent a potentially large saving for the GC, as well
> as reducing random jumps around memory which are very cache-stressing.

Embedded values would be something like this:

type p2 = { x : int; y : int; }
type p3 = { xy : embedded p2; z : int; }

The type p3 would still be a single block and p3.xy.x would not need
an extra indirection.

Right?

I'm not sure this would be too usefull overall. In the above example
it would be far better to have a row type like in objects. Something
like

type p3 = { p2 with z : int; }

That would even allow using p3.x instead of p3.xy.x.

It would also allow passing a p3 record to a function expecting a p2
record. With embedded types that would not be possible. An embedded
type would have to be passed to other functions as pointer to the
begining of the parent and offset within.

Another thing is, as recently mentioned, that the compiler aparently
aggregates allocations. So if you write

let p2 = { x = 1; y = 2; } in
let p3 = { xy = p2; z = 3; } in

that results in a single allocation and p2 and p3 will be cache local.
I think they will even stay next to each other during compression.
Right?

Embedding would get rid of the xy pointer and indirection but would it
help so much? Would it be used widely?

Given that embedded and not embedded types are incompatible I think
this would rather be limited to stay within modules and never cross
the module boundary. For example an internal hashtable would make use
of it. And why have extra headers for the embedded values? Incorporate
them into the main header directly and make

type p3 = { xy : embedded p2; z : int; }

equivalent to

type p3 = { xy.x : int; xy.y: int; z : int; }

A block containing 3 values. Embedding extra headers would only be
usefull when embedding largish float arrays (which don't need a bit
for every member) into large records. And then why not use the
indifection? Given then size it probably will be negible. And for
small structures use the individual bits to mark every float member
independently.

So overally I'm doubtfull the added complexity to recurse into
embedded headers is worth it.

MfG
	Goswin

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] Proposal: re-design of ocaml headers
  2013-09-27 14:05 [Caml-list] Proposal: re-design of ocaml headers Yotam Barnoy
  2013-09-27 15:08 ` Dmitry Grebeniuk
  2013-09-30 14:48 ` Goswin von Brederlow
@ 2013-10-06 10:39 ` Florian Weimer
  2 siblings, 0 replies; 18+ messages in thread
From: Florian Weimer @ 2013-10-06 10:39 UTC (permalink / raw)
  To: Yotam Barnoy; +Cc: Ocaml Mailing List

* Yotam Barnoy:

> The pfbits (as shown below) are an efficient way of representing both
> floats and pointers in a data structure, at the cost of disallowing random
> access. From what I can gather, random access to this data is never needed,
> since the GC, Marshal module, and polymorphic comparison all process the
> whole data structure rather than referring to specific parts of it.

I wonder if it could make sense to get rid of polymorphic comparison
and make Marshal type-safe before changing the header.  If the header
is needed for GC only and for telling variants apart, it would not be
necessary to discriminate between strings, floats and custom memory
blocks in the header.  That would give room for 28 bits for the
tag/size combination or the string or value array length, with 2 bits
used by the GC colors and 2 bits for the type bits (telling apart
non-scanned blobs, scanned arrays of values, scanned tagged variants,
and unscanned tagged variants).  Arrays are separate, so 28 bits
should be plenty of room for both tag and variant record length.

This way, it would be possible to encode both string and array lengths
directly in the header, speeding up bounds checks.  The GC would have
check the type bits to compute the actual object size from the length
in the header, but mutator performance seems more important, and
perhaps we can come up with a clever way to decode the length without
branching.  It might even make sense to store the actual number of
elements of a double array in the header, again to speed up bounds
checks.

If the 32-bit 256 MB string size limit is too onerous, it should be
possible to bump it to 512 MB, but that will complicate header
decoding further.  But it's possible to index only 1 GB, so 256 MB
is already pretty close to that hard limit.

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2014-02-01 15:27 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-09-27 14:05 [Caml-list] Proposal: re-design of ocaml headers Yotam Barnoy
2013-09-27 15:08 ` Dmitry Grebeniuk
     [not found]   ` <CAN6ygOmuCX6HLfSns0tXQCF3LWMANqhpnSN0vGWcNg0one2QzQ@mail.gmail.com>
2013-09-27 15:25     ` [Caml-list] Fwd: " Yotam Barnoy
2013-09-27 16:20       ` Dmitry Grebeniuk
2013-09-27 18:08         ` Yotam Barnoy
2013-09-27 18:12           ` Yotam Barnoy
2013-09-27 18:15           ` Paolo Donadeo
2013-09-27 18:41             ` Yotam Barnoy
2013-09-27 15:31   ` [Caml-list] " Anthony Tavener
2013-09-27 15:37     ` Yotam Barnoy
2013-09-27 16:50     ` Dmitry Grebeniuk
2013-09-30 14:48 ` Goswin von Brederlow
2013-09-30 15:31   ` Yotam Barnoy
2013-10-08 10:52     ` Goswin von Brederlow
2013-10-11 15:48       ` Yotam Barnoy
2014-01-30 20:53         ` Yotam Barnoy
2014-02-01 15:27         ` Goswin von Brederlow
2013-10-06 10:39 ` Florian Weimer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox