* [Caml-list] Proposal: re-design of ocaml headers @ 2013-09-27 14:05 Yotam Barnoy 2013-09-27 15:08 ` Dmitry Grebeniuk ` (2 more replies) 0 siblings, 3 replies; 18+ messages in thread From: Yotam Barnoy @ 2013-09-27 14:05 UTC (permalink / raw) To: Ocaml Mailing List [-- Attachment #1: Type: text/plain, Size: 4946 bytes --] Following up on the previous thread I started in this general topic, I present some more thinking I've done on the redesign of ocaml's headers. The purpose of this redesign is to lift the tag number restriction as well as size limits on 32-bit platforms. At the same time, header bits can be used to indicate floats, allowing cheaper usage of floats in data structures, and even to indicate the presence of pointers, making the traversal of some data structures by the GC unnecessary. The basic idea of this redesign is that most allocations need only a small amount of space, but a large number of tags is a necessity. If you're allocating a large block of memory (>8KB) then you can spare another word for the size. The pfbits (as shown below) are an efficient way of representing both floats and pointers in a data structure, at the cost of disallowing random access. From what I can gather, random access to this data is never needed, since the GC, Marshal module, and polymorphic comparison all process the whole data structure rather than referring to specific parts of it. Open issue: making float representation efficient on the stack. + For 16-bit and 32-bit architectures: +---------------+----+----+-----+-------+------+ | wosize | ext|cust|noptr| color | tag | +---------------+----+----+-----+-------+------+ bits 31 21 20 19 18 17 16 15 0 - noptr: no pointers present - ext: uses extension word - cust(om): uses custom word. Custom word is normally used to indicate floats and pointers. 32 bit extension word (present only if ext is 1) +---------------------------------------------+ | wosize | +---------------------------------------------+ bits 31 0 32 bit custom word (default usage - present only if cust is 1): +----+----------------------------------------+ |nofp| pfbits | +----+----------------------------------------+ bits 31 30 0 - nofp: a structure with no floats. All pfbits are used for pointers, with a 1 signifying a pointer and a 0 signifying a value. - pfbits: indicates which double words are floats and pointers. Starting at the highest bit: - a 0 indicates neither a pointer nor a float - a 10 indicates a float (double) - a 11 indicates a pointer - If noptr is set, each bit indicates a float. If nofp is set, each bit indicates a pointer. + For 64-bit architectures: +----------------+--------+----+----+-----+-------+------+ | pfbits | wosize |cust|nofp|noptr| color | tag | +----------------+--------+----+----+-----+-------+------+ bits 63 40 39 21 20 19 18 17 16 15 0 - noptr: a structure with no pointers. All pfbits are used for floats, with a 1 signifying a float and a 0 signifying a non-float. - nofp: a structure with no floats. All pfbits are used for pointers, with a 1 signifying a pointer and a 0 signifying a value. - If both noptr and nofp are set, wosize is extended to include the pfbits. - cust(om): uses custom double word. Custom double word is normally used to indicate more floats and pointers, but functionality can change with certain tags. - If the custom bit is set, wosize is expanded to include the pfbits in the main header. - pfbits: indicates which double words are floats and pointers. Starting at the highest bit: - a 0 indicates neither a pointer nor a float - a 10 indicates a float (double) - a 11 indicates a pointer - If noptr is set, each bit indicates a float. If nofp is set, each bit indicates a pointer. 64 bit custom header (default usage indicated - present only if cust is 1): +--------------------------------------------------------+ | pfbits | +--------------------------------------------------------+ bits 63 0 - pfbits: indicates which double words are floats and pointers. Starting at the highest bit: - a 0 indicates neither a pointer nor a float - a 10 indicates a float (double) - a 11 indicates a pointer - If noptr is set, each bit indicates a float. If nofp is set, each bit indicates a pointer. + Tags: - I think it's a good idea to move custom tags to the low end of the spectrum, and add more there if any are needed. This way, if the tag field is ever expanded, it's not necessary to move the custom tags again. - 0: Closure tag - 1: Infix tag (must be 1 mod 4) - 2: Lazy tag - 3: Object tag - 4: Forward tag - 5: Abstract tag - 6: String tag - 7: Double tag - 8: Custom tag - 9: Proposed tag: custom array. Half of custom header is used to indicate array member size, so one could have an array of tuples, saving both memory and indirections. - 100: Start of user tags Yotam Barnoy [-- Attachment #2: Type: text/html, Size: 5526 bytes --] ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Caml-list] Proposal: re-design of ocaml headers 2013-09-27 14:05 [Caml-list] Proposal: re-design of ocaml headers Yotam Barnoy @ 2013-09-27 15:08 ` Dmitry Grebeniuk [not found] ` <CAN6ygOmuCX6HLfSns0tXQCF3LWMANqhpnSN0vGWcNg0one2QzQ@mail.gmail.com> 2013-09-27 15:31 ` [Caml-list] " Anthony Tavener 2013-09-30 14:48 ` Goswin von Brederlow 2013-10-06 10:39 ` Florian Weimer 2 siblings, 2 replies; 18+ messages in thread From: Dmitry Grebeniuk @ 2013-09-27 15:08 UTC (permalink / raw) To: Yotam Barnoy Hello. Can you please share your experience of writing bindings to some C libraries for <your preferable language>? How it compares to writing bindings for OCaml? (this is a thread about runtime values representation, I suppose.) Why will anyone ever need more than 200 constructors of a sum type? (also note the presence of polymorphic variant types.) > random access to this data is never needed mkay... :[ ^ permalink raw reply [flat|nested] 18+ messages in thread
[parent not found: <CAN6ygOmuCX6HLfSns0tXQCF3LWMANqhpnSN0vGWcNg0one2QzQ@mail.gmail.com>]
* [Caml-list] Fwd: Proposal: re-design of ocaml headers [not found] ` <CAN6ygOmuCX6HLfSns0tXQCF3LWMANqhpnSN0vGWcNg0one2QzQ@mail.gmail.com> @ 2013-09-27 15:25 ` Yotam Barnoy 2013-09-27 16:20 ` Dmitry Grebeniuk 0 siblings, 1 reply; 18+ messages in thread From: Yotam Barnoy @ 2013-09-27 15:25 UTC (permalink / raw) To: Ocaml Mailing List [-- Attachment #1: Type: text/plain, Size: 1748 bytes --] > > Can you please share your experience of writing bindings to some C > libraries for <your preferable language>? How it compares to writing > bindings for OCaml? (this is a thread about runtime values > representation, I suppose.) > > This isn't really relevant to this topic, since this discussion is just about ocaml headers, rather than the ocaml C FFI. The FFI would remain largely the same. > Why will anyone ever need more than 200 constructors of a sum type? > (also note the presence of polymorphic variant types.) > I thought so too, but as mentioned in the previous discussion thread (entitled 'Expanding the Float Array Tag'), code generators can easily run out of 200 constructors. However, it's possible that 65,000 constructors is really excessive, in which case some bits can be shifted around. I'd be happy to get some feedback on this. > > random access to this data is never needed > > mkay... :[ > This is only talking about automatic ocaml functions, which work without any type information. Proper ocaml code is typed, so it doesn't need this information. But internal mechanisms such as the GC, marshal module, and polymorphic comparison are 'dumb' in the sense that they're missing all of the crucial type information. For these functions, there needs to be some basic type information within the data structure itself. Currently, floats are stored in their own independent data structure with a 'double_tag' header, or otherwise they're in a 'double_array'. Having some information in the header about internal floats and pointers is much more efficient, but it doesn't need to have random access -- the internal ocaml functions that use this information process the entire data structure at once. Yotam [-- Attachment #2: Type: text/html, Size: 2693 bytes --] ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Caml-list] Fwd: Proposal: re-design of ocaml headers 2013-09-27 15:25 ` [Caml-list] Fwd: " Yotam Barnoy @ 2013-09-27 16:20 ` Dmitry Grebeniuk 2013-09-27 18:08 ` Yotam Barnoy 0 siblings, 1 reply; 18+ messages in thread From: Dmitry Grebeniuk @ 2013-09-27 16:20 UTC (permalink / raw) To: Yotam Barnoy Hello. >> (this is a thread about runtime values >> representation, I suppose.) > This isn't really relevant to this topic, since this discussion is just > about ocaml headers, rather than the ocaml C FFI. The FFI would remain > largely the same. There is a lot of bindings have to be rewritten due to these changes. You can not automate it with C preprocessor. What would you suggest here? ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Caml-list] Fwd: Proposal: re-design of ocaml headers 2013-09-27 16:20 ` Dmitry Grebeniuk @ 2013-09-27 18:08 ` Yotam Barnoy 2013-09-27 18:12 ` Yotam Barnoy 2013-09-27 18:15 ` Paolo Donadeo 0 siblings, 2 replies; 18+ messages in thread From: Yotam Barnoy @ 2013-09-27 18:08 UTC (permalink / raw) To: Dmitry Grebeniuk, Ocaml Mailing List [-- Attachment #1: Type: text/plain, Size: 1258 bytes --] This is a good point. I'm not that familiar with the bindings in ocaml. What I can say is that bindings should be written at an abstract enough level that they don't mess directly with internal bit representations. Using the ctypes library seems like a good way to go. If a bunch of libraries' bindings have to be rewritten because the internal runtime representation has changed... well then maybe that should be done once to allow them to be abstract enough so that future internal changes don't have the same impact. Yotam On Fri, Sep 27, 2013 at 12:20 PM, Dmitry Grebeniuk <gdsfh1@gmail.com> wrote: > Hello. > > >> (this is a thread about runtime values > >> representation, I suppose.) > > This isn't really relevant to this topic, since this discussion is just > > about ocaml headers, rather than the ocaml C FFI. The FFI would remain > > largely the same. > > There is a lot of bindings have to be rewritten due to these > changes. You can not automate it with C preprocessor. What would you > suggest here? > > -- > Caml-list mailing list. Subscription management and archives: > https://sympa.inria.fr/sympa/arc/caml-list > Beginner's list: http://groups.yahoo.com/group/ocaml_beginners > Bug reports: http://caml.inria.fr/bin/caml-bugs > [-- Attachment #2: Type: text/html, Size: 1981 bytes --] ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Caml-list] Fwd: Proposal: re-design of ocaml headers 2013-09-27 18:08 ` Yotam Barnoy @ 2013-09-27 18:12 ` Yotam Barnoy 2013-09-27 18:15 ` Paolo Donadeo 1 sibling, 0 replies; 18+ messages in thread From: Yotam Barnoy @ 2013-09-27 18:12 UTC (permalink / raw) To: Dmitry Grebeniuk, Ocaml Mailing List [-- Attachment #1: Type: text/plain, Size: 5534 bytes --] I updated the definitions by reducing the number of bits for the tag to 13 (allowing up to 8000 tags, which definitely seems like it should be enough). I also forgot to insert the 0 tag, which is for arrays, tuples, records etc. So that's included now: + For 16-bit and 32-bit architectures: +---------------+----+----+-----+-------+------+ | wosize | ext|cust|noptr| color | tag | +---------------+----+----+-----+-------+------+ bits 31 18 17 16 15 14 13 12 0 - noptr: no pointers present - ext: uses extension word - cust(om): uses custom word. Custom word is normally used to indicate floats and pointers. - If both ext and cust are on, the extension word precedes the custom word in memory. 32 bit extension word (present only if ext is 1) +---------------------------------------------+ | wosize | +---------------------------------------------+ bits 31 0 32 bit custom word (default usage - present only if cust is 1): +----+----------------------------------------+ |nofp| pfbits | +----+----------------------------------------+ bits 31 30 0 - pfbits: indicates which double words are floats and pointers. Starting at the highest bit: - a 0 indicates neither a pointer nor a float - a 10 indicates a float (double) - a 11 indicates a pointer - If noptr is set, each bit indicates a float. If nofp is set, each bit indicates a pointer. + For 64-bit architectures: +----------------+--------+----+----+-----+-------+------+ | pfbits | wosize |cust|nofp|noptr| color | tag | +----------------+--------+----+----+-----+-------+------+ bits 63 37 36 18 17 16 15 14 13 12 0 - noptr: a structure with no pointers. All pfbits are used for floats, with a 1 signifying a float and a 0 signifying a non-float. - nofp: a structure with no floats. All pfbits are used for pointers, with a 1 signifying a pointer and a 0 signifying a value. - If both noptr and nofp are set, wosize is extended to include the pfbits. It does NOT mean that the structure has no floats and no pointers, only that the pfbits field is unused. - cust(om): uses custom double word. Custom double word is normally used to indicate more floats and pointers, but functionality can change with certain tags. - If the custom bit is set, wosize is expanded to include the pfbits in the main header. - pfbits: indicates which double words are floats and pointers. Starting at the highest bit: - a 0 indicates neither a pointer nor a float - a 10 indicates a float (double) - a 11 indicates a pointer - If noptr is set, each bit indicates a float. If nofp is set, each bit indicates a pointer. 64 bit custom header (default usage indicated - present only if cust is 1): +--------------------------------------------------------+ | pfbits | +--------------------------------------------------------+ bits 63 0 - pfbits: indicates which double words are floats and pointers. Starting at the highest bit: - a 0 indicates neither a pointer nor a float - a 10 indicates a float (double) - a 11 indicates a pointer - If noptr is set, each bit indicates a float. If nofp is set, each bit indicates a pointer. + Tags: - I think in general it's a good idea to move proprietary tags to the low end of the spectrum, and add more there if any are needed. This way, if the tag field is ever expanded, it's not necessary to move these tags again. - 0: Array, record, tuple tag - 1: Infix tag (must be 1 mod 4) - 2: Closure tag - 3: Lazy tag - 4: Object tag - 5: Forward tag - 6: Abstract tag - 7: String tag - 8: Double tag - 9: Custom tag - 10: Proposed tag: custom array. Half of custom header is used to indicate array member size, so one could have an array of tuples, saving both memory and indirections. - 100: Start of user tags -Yotam On Fri, Sep 27, 2013 at 2:08 PM, Yotam Barnoy <yotambarnoy@gmail.com> wrote: > This is a good point. I'm not that familiar with the bindings in ocaml. > What I can say is that bindings should be written at an abstract enough > level that they don't mess directly with internal bit representations. > Using the ctypes library seems like a good way to go. If a bunch of > libraries' bindings have to be rewritten because the internal runtime > representation has changed... well then maybe that should be done once to > allow them to be abstract enough so that future internal changes don't have > the same impact. > > Yotam > > > On Fri, Sep 27, 2013 at 12:20 PM, Dmitry Grebeniuk <gdsfh1@gmail.com>wrote: > >> Hello. >> >> >> (this is a thread about runtime values >> >> representation, I suppose.) >> > This isn't really relevant to this topic, since this discussion is just >> > about ocaml headers, rather than the ocaml C FFI. The FFI would remain >> > largely the same. >> >> There is a lot of bindings have to be rewritten due to these >> changes. You can not automate it with C preprocessor. What would you >> suggest here? >> >> -- >> Caml-list mailing list. Subscription management and archives: >> https://sympa.inria.fr/sympa/arc/caml-list >> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners >> Bug reports: http://caml.inria.fr/bin/caml-bugs >> > > [-- Attachment #2: Type: text/html, Size: 7220 bytes --] ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Caml-list] Fwd: Proposal: re-design of ocaml headers 2013-09-27 18:08 ` Yotam Barnoy 2013-09-27 18:12 ` Yotam Barnoy @ 2013-09-27 18:15 ` Paolo Donadeo 2013-09-27 18:41 ` Yotam Barnoy 1 sibling, 1 reply; 18+ messages in thread From: Paolo Donadeo @ 2013-09-27 18:15 UTC (permalink / raw) To: OCaml mailing list On Fri, Sep 27, 2013 at 8:08 PM, Yotam Barnoy <yotambarnoy@gmail.com> wrote: > What I can say is that bindings should be written at an abstract enough > level that they don't mess directly with internal bit representations. Many bindings deal with memory allocation. The binding of Lua deeply interacts with the garbage collector and inspects the actual type (tag) of OCaml value passed. In principle you are right but the reality is that there is no much "abstraction" in C :-) -- Paolo ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Caml-list] Fwd: Proposal: re-design of ocaml headers 2013-09-27 18:15 ` Paolo Donadeo @ 2013-09-27 18:41 ` Yotam Barnoy 0 siblings, 0 replies; 18+ messages in thread From: Yotam Barnoy @ 2013-09-27 18:41 UTC (permalink / raw) To: Paolo Donadeo; +Cc: OCaml mailing list [-- Attachment #1: Type: text/plain, Size: 1067 bytes --] On Fri, Sep 27, 2013 at 2:15 PM, Paolo Donadeo <p.donadeo@gmail.com> wrote: > Many bindings deal with memory allocation. The binding of Lua deeply > interacts with the garbage collector and inspects the actual type > (tag) of OCaml value passed. > > In principle you are right but the reality is that there is no much > "abstraction" in C :-) > > It's true that coming from the C side, abstraction is a problem. However, the kind of bindings you're describing (Lua) essentially inhibit any improvement to the GC (such as parallelization, which is bound to happen some day) or to the runtime. It seems like it would be much better if bindings from C (Lua, for example) called more general functions or bindings present in the runtime's C code aka an API. Given that it seems like some bindings currently don't use an API but inspect values closely, I don't see any solution other than adapting the things that require changes. Hopefully the majority of C bindings (at least from the ocaml side) can be moved to ctypes, and can therefore be made more generic. -Yotam [-- Attachment #2: Type: text/html, Size: 1520 bytes --] ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Caml-list] Proposal: re-design of ocaml headers 2013-09-27 15:08 ` Dmitry Grebeniuk [not found] ` <CAN6ygOmuCX6HLfSns0tXQCF3LWMANqhpnSN0vGWcNg0one2QzQ@mail.gmail.com> @ 2013-09-27 15:31 ` Anthony Tavener 2013-09-27 15:37 ` Yotam Barnoy 2013-09-27 16:50 ` Dmitry Grebeniuk 1 sibling, 2 replies; 18+ messages in thread From: Anthony Tavener @ 2013-09-27 15:31 UTC (permalink / raw) To: Dmitry Grebeniuk; +Cc: Yotam Barnoy [-- Attachment #1: Type: text/plain, Size: 794 bytes --] On Fri, Sep 27, 2013 at 9:08 AM, Dmitry Grebeniuk <gdsfh1@gmail.com> wrote: > > Why will anyone ever need more than 200 constructors of a sum type? > (also note the presence of polymorphic variant types.) > > Back when I read about the limit on constructors I had a moment of worry. Thankfully the limit is only on non-constant constructors. I currently have a variant with 292 constructors, but only 30 are non-constant. These represent optional characteristics a character may have in a game, some with additional payload. I could imagine all of them having their own payload, but of course there are other options, like polymorphic variants, or encoding these purely as data rather than types. I wanted to share that as a "be careful of what seems impossible from your perspective". ;) [-- Attachment #2: Type: text/html, Size: 1333 bytes --] ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Caml-list] Proposal: re-design of ocaml headers 2013-09-27 15:31 ` [Caml-list] " Anthony Tavener @ 2013-09-27 15:37 ` Yotam Barnoy 2013-09-27 16:50 ` Dmitry Grebeniuk 1 sibling, 0 replies; 18+ messages in thread From: Yotam Barnoy @ 2013-09-27 15:37 UTC (permalink / raw) To: Anthony Tavener; +Cc: Dmitry Grebeniuk, Yotam Barnoy [-- Attachment #1: Type: text/plain, Size: 1166 bytes --] On Fri, Sep 27, 2013 at 11:31 AM, Anthony Tavener <anthony.tavener@gmail.com > wrote: > > On Fri, Sep 27, 2013 at 9:08 AM, Dmitry Grebeniuk <gdsfh1@gmail.com>wrote: > >> >> Why will anyone ever need more than 200 constructors of a sum type? >> (also note the presence of polymorphic variant types.) >> >> > Back when I read about the limit on constructors I had a moment of worry. > Thankfully the limit is only on non-constant constructors. > > I currently have a variant with 292 constructors, but only 30 are > non-constant. > These represent optional characteristics a character may have in a game, > some > with additional payload. I could imagine all of them having their own > payload, > but of course there are other options, like polymorphic variants, or > encoding these > purely as data rather than types. > > I wanted to share that as a "be careful of what seems impossible from your > perspective". ;) > Right. Constant constructors are represented as integers and require no header, so they're very cheap. And polymorphic variants use more memory, so they're not a great option. Having the flexibility for more constructors is a good thing. Yotam [-- Attachment #2: Type: text/html, Size: 2041 bytes --] ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Caml-list] Proposal: re-design of ocaml headers 2013-09-27 15:31 ` [Caml-list] " Anthony Tavener 2013-09-27 15:37 ` Yotam Barnoy @ 2013-09-27 16:50 ` Dmitry Grebeniuk 1 sibling, 0 replies; 18+ messages in thread From: Dmitry Grebeniuk @ 2013-09-27 16:50 UTC (permalink / raw) To: Anthony Tavener Hello. (first, personal: nice to see someone developing games in OCaml! Wish you good profit here. Btw, I'd like to see an announcement of your game here.) > I currently have a variant with 292 constructors, but only 30 are > non-constant. > These represent optional characteristics a character may have in a game, > some > with additional payload. Things I can think of: - Are these characteristics "flat", ungroupped, without any common logic like "health-affecting", "strength-affecting"? Maybe it would be nice (w.r.t. the logic) to group them? - Sum types should be matched fully, each constructor should be present in "match .. with". That's a point of sum types and guarantees about them (there's a warning about it in the compiler, great thing). I doubt you usually have such heavy matches. So if you don't make use of such kind of guarantees, you can use even exceptions to encode characteristics. (polymorphic variants are fine too, of course.) > I wanted to share that as a "be careful of what seems impossible from your > perspective". ;) Got it. But any language has some limits of what can one achieve with it "out of box and using common ways". There are other ways to make things work. So one of my mottos is "keep caml and curry on". ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Caml-list] Proposal: re-design of ocaml headers 2013-09-27 14:05 [Caml-list] Proposal: re-design of ocaml headers Yotam Barnoy 2013-09-27 15:08 ` Dmitry Grebeniuk @ 2013-09-30 14:48 ` Goswin von Brederlow 2013-09-30 15:31 ` Yotam Barnoy 2013-10-06 10:39 ` Florian Weimer 2 siblings, 1 reply; 18+ messages in thread From: Goswin von Brederlow @ 2013-09-30 14:48 UTC (permalink / raw) To: caml-list On Fri, Sep 27, 2013 at 10:05:56AM -0400, Yotam Barnoy wrote: > Following up on the previous thread I started in this general topic, I > present some more thinking I've done on the redesign of ocaml's headers. > The purpose of this redesign is to lift the tag number restriction as well > as size limits on 32-bit platforms. At the same time, header bits can be > used to indicate floats, allowing cheaper usage of floats in data > structures, and even to indicate the presence of pointers, making the > traversal of some data structures by the GC unnecessary. > > The basic idea of this redesign is that most allocations need only a small > amount of space, but a large number of tags is a necessity. If you're > allocating a large block of memory (>8KB) then you can spare another word > for the size. > > The pfbits (as shown below) are an efficient way of representing both > floats and pointers in a data structure, at the cost of disallowing random > access. From what I can gather, random access to this data is never needed, > since the GC, Marshal module, and polymorphic comparison all process the > whole data structure rather than referring to specific parts of it. > > Open issue: making float representation efficient on the stack. > > > + For 16-bit and 32-bit architectures: > +---------------+----+----+-----+-------+------+ > | wosize | ext|cust|noptr| color | tag | > +---------------+----+----+-----+-------+------+ > bits 31 21 20 19 18 17 16 15 0 > > - noptr: no pointers present > - ext: uses extension word > - cust(om): uses custom word. Custom word is normally used to indicate > floats and pointers. > > 32 bit extension word (present only if ext is 1) > +---------------------------------------------+ > | wosize | > +---------------------------------------------+ > bits 31 0 Why use a full bit for ext? I would define wosize == 0 to mean an extension word with the actual size is present. That way sizes up to <16KB can be encoded without extension word. > 32 bit custom word (default usage - present only if cust is 1): > +----+----------------------------------------+ > |nofp| pfbits | > +----+----------------------------------------+ > bits 31 30 0 > > - nofp: a structure with no floats. All pfbits are used for pointers, with > a 1 signifying a pointer and a 0 signifying a value. > - pfbits: indicates which double words are floats and pointers. Starting at > the highest bit: > - a 0 indicates neither a pointer nor a float > - a 10 indicates a float (double) > - a 11 indicates a pointer > - If noptr is set, each bit indicates a float. If nofp is set, each bit > indicates a pointer. There are 3 kinds of values: 1) pointers with bit 0 == 0 2) non-pointers with bit 0 == 1 3) floats with all bits used for the type (spanning 2 fields in 32bit) So if pfbits indicates a float then a field (or 2) is a float and all bits are used for the value. Otherwise the bit 0 of the field will tell you wether it is a pointer or not. So why would you want to duplicate that information in the pfbits? Or do you want to store untagged/unboxed values in blocks? That also means that the nofp bit is pointless. If there are no floats then no custom word is required. And if no custom word is given then the block must not have any floats. I do love the noptr bit though. Blocks without pointers are quite common and int/bool/char arrays will hugely benefit from that. > + For 64-bit architectures: > > +----------------+--------+----+----+-----+-------+------+ > | pfbits | wosize |cust|nofp|noptr| color | tag | > +----------------+--------+----+----+-----+-------+------+ > bits 63 40 39 21 20 19 18 17 16 15 0 Isn't wosize a bit small there? That limits blocks to 4MiB total, right? > - noptr: a structure with no pointers. All pfbits are used for floats, with > a 1 signifying a float and a 0 signifying a non-float. > - nofp: a structure with no floats. All pfbits are used for pointers, with > a 1 signifying a pointer and a 0 signifying a value. > - If both noptr and nofp are set, wosize is extended to include the pfbits. > - cust(om): uses custom double word. Custom double word is normally used to > indicate more floats and pointers, but functionality can change with > certain tags. > - If the custom bit is set, wosize is expanded to include the pfbits in > the main header. > - pfbits: indicates which double words are floats and pointers. Starting at > the highest bit: > - a 0 indicates neither a pointer nor a float > - a 10 indicates a float (double) > - a 11 indicates a pointer > - If noptr is set, each bit indicates a float. If nofp is set, each bit > indicates a pointer. As above pointer should not be in the pfbits. So if nofp is set then wosize would be extended to include pfbits. Same if Double tag is set. That would allow for larger arrays without floats or only floats. I guess blocks with a mix of float and non-floats will not be large. Even for generated code I don't expect constructors or recrods with more than 500k labels. > 64 bit custom header (default usage indicated - present only if cust is 1): > +--------------------------------------------------------+ > | pfbits | > +--------------------------------------------------------+ > bits 63 0 > > - pfbits: indicates which double words are floats and pointers. Starting at > the highest bit: > - a 0 indicates neither a pointer nor a float > - a 10 indicates a float (double) > - a 11 indicates a pointer > - If noptr is set, each bit indicates a float. If nofp is set, each bit > indicates a pointer. > > + Tags: > - I think it's a good idea to move custom tags to the low end of the > spectrum, and add more there if any are needed. This way, if the tag field > is ever expanded, it's not necessary to move the custom tags again. > > - 0: Closure tag > - 1: Infix tag (must be 1 mod 4) > - 2: Lazy tag > - 3: Object tag > - 4: Forward tag > - 5: Abstract tag > - 6: String tag > - 7: Double tag > - 8: Custom tag > - 9: Proposed tag: custom array. Half of custom header is used to indicate > array member size, so one could have an array of tuples, saving both memory > and indirections. > - 100: Start of user tags > > Yotam Barnoy It might be nice to support C values like untagged ints or unaligned pointers. If Custom tag is set then the pfbits become ocaml value bits. The GC will only inspect fields with pfbit set. All other fields are ignored. The custom_operations handle compare, hash, serialize and deserialize so nothing else will access the data. Another thing are int32 and int64. I guess if you want to unbox those then having 2 bits per field in pfbits makes sense again. But then I would allocate them as: - a 00 indicates a tagged value (int or pointer) - a 01 indicates a non-pointer: int, int32, native int, C pointer - a 10 indicates a float (double) - a 11 indicates an int64 The higher bit would indicate a 64bit value, meaning spanning 2 fields on 32bit. Not that those 4 values allow mixing ocaml values, C values, int32, int64 and float in a block. I would combine the noptr and nofp bits into a single 2bit field: - a 00 indicates no pointers and no double size, no pfbits - a 01 indicates no double size, pfbits indicate tagged / non-pointer - a 10 indicates no pointers but double size, pfbits indicate size - a 11 indicates both pointers and double size, 2 pfbits per field Note: tagged integers can be stored as 00 or 01. I think this would be required for polymorphic types. An 'a could be int or pointer. In both cases 00 will work. MfG Goswin ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Caml-list] Proposal: re-design of ocaml headers 2013-09-30 14:48 ` Goswin von Brederlow @ 2013-09-30 15:31 ` Yotam Barnoy 2013-10-08 10:52 ` Goswin von Brederlow 0 siblings, 1 reply; 18+ messages in thread From: Yotam Barnoy @ 2013-09-30 15:31 UTC (permalink / raw) To: Goswin von Brederlow; +Cc: Ocaml Mailing List [-- Attachment #1: Type: text/plain, Size: 4829 bytes --] On Mon, Sep 30, 2013 at 10:48 AM, Goswin von Brederlow <goswin-v-b@web.de>wrote: > > > > + For 16-bit and 32-bit architectures: > > +---------------+----+----+-----+-------+------+ > > | wosize | ext|cust|noptr| color | tag | > > +---------------+----+----+-----+-------+------+ > > bits 31 21 20 19 18 17 16 15 0 > > > > - noptr: no pointers present > > - ext: uses extension word > > - cust(om): uses custom word. Custom word is normally used to indicate > > floats and pointers. > > > > 32 bit extension word (present only if ext is 1) > > +---------------------------------------------+ > > | wosize | > > +---------------------------------------------+ > > bits 31 0 > > Why use a full bit for ext? I would define wosize == 0 to mean an > extension word with the actual size is present. That way sizes up to > <16KB can be encoded without extension word. > > Great point! Of course, that makes perfect sense. I was feeling like I was wasting the wosize bits with the extension word but couldn't quite get put 2 and 2 together. BTW, down the thread is a newer version of the design that reduces the tag space to 8000 tags, which I do think is sufficient. > > 32 bit custom word (default usage - present only if cust is 1): > > +----+----------------------------------------+ > > |nofp| pfbits | > > +----+----------------------------------------+ > > bits 31 30 0 > > > > - nofp: a structure with no floats. All pfbits are used for pointers, > with > > a 1 signifying a pointer and a 0 signifying a value. > > - pfbits: indicates which double words are floats and pointers. Starting > at > > the highest bit: > > - a 0 indicates neither a pointer nor a float > > - a 10 indicates a float (double) > > - a 11 indicates a pointer > > - If noptr is set, each bit indicates a float. If nofp is set, each > bit > > indicates a pointer. > > There are 3 kinds of values: > > 1) pointers with bit 0 == 0 > 2) non-pointers with bit 0 == 1 > 3) floats with all bits used for the type (spanning 2 fields in 32bit) > > So if pfbits indicates a float then a field (or 2) is a float and all > bits are used for the value. Otherwise the bit 0 of the field will > tell you wether it is a pointer or not. So why would you want to > duplicate that information in the pfbits? > I was thinking of doing it for efficiency. If we're already indicating what's what, we might as well represent shortcuts to the pointers, which would cut down on the amount of reading, no? In the average case, the GC would need to access a lot less memory. > It might be nice to support C values like untagged ints or unaligned > pointers. If Custom tag is set then the pfbits become ocaml value > bits. The GC will only inspect fields with pfbit set. All other fields > are ignored. The custom_operations handle compare, hash, serialize and > deserialize so nothing else will access the data. > > Another thing are int32 and int64. I guess if you want to unbox those > then having 2 bits per field in pfbits makes sense again. But then I > would allocate them as: > > - a 00 indicates a tagged value (int or pointer) > - a 01 indicates a non-pointer: int, int32, native int, C pointer > - a 10 indicates a float (double) > - a 11 indicates an int64 > > The higher bit would indicate a 64bit value, meaning spanning 2 fields > on 32bit. Not that those 4 values allow mixing ocaml values, C values, > int32, int64 and float in a block. > > I would combine the noptr and nofp bits into a single 2bit field: > > - a 00 indicates no pointers and no double size, no pfbits > - a 01 indicates no double size, pfbits indicate tagged / non-pointer > - a 10 indicates no pointers but double size, pfbits indicate size > - a 11 indicates both pointers and double size, 2 pfbits per field > > Note: tagged integers can be stored as 00 or 01. I think this would be > required for polymorphic types. An 'a could be int or pointer. In both > cases 00 will work. > > I really like this idea -- unboxing more types could be really useful. I'm not sure double 'size' would work, however. It should be fine for the marshal module, but polymorphic comparison would get messed up because floats have to be compared differently. So I think 10 in the bit field should indicate no pointers but floats, while 11 could allow both pointers and double size, with the 2-bits specifying if it's a float or an int64 (as you've outlined). Of course, one cannot have both shortcuts to pointers and enhanced unboxing, so let me know what you think about the performance increase from shortcutting the tag bit. Yotam [-- Attachment #2: Type: text/html, Size: 5931 bytes --] ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Caml-list] Proposal: re-design of ocaml headers 2013-09-30 15:31 ` Yotam Barnoy @ 2013-10-08 10:52 ` Goswin von Brederlow 2013-10-11 15:48 ` Yotam Barnoy 0 siblings, 1 reply; 18+ messages in thread From: Goswin von Brederlow @ 2013-10-08 10:52 UTC (permalink / raw) To: Yotam Barnoy; +Cc: Ocaml Mailing List On Mon, Sep 30, 2013 at 11:31:23AM -0400, Yotam Barnoy wrote: > On Mon, Sep 30, 2013 at 10:48 AM, Goswin von Brederlow <goswin-v-b@web.de>wrote: > > > > > > > + For 16-bit and 32-bit architectures: > > > +---------------+----+----+-----+-------+------+ > > > | wosize | ext|cust|noptr| color | tag | > > > +---------------+----+----+-----+-------+------+ > > > bits 31 21 20 19 18 17 16 15 0 > > > > > > - noptr: no pointers present > > > - ext: uses extension word > > > - cust(om): uses custom word. Custom word is normally used to indicate > > > floats and pointers. > > > > > > 32 bit extension word (present only if ext is 1) > > > +---------------------------------------------+ > > > | wosize | > > > +---------------------------------------------+ > > > bits 31 0 > > > > Why use a full bit for ext? I would define wosize == 0 to mean an > > extension word with the actual size is present. That way sizes up to > > <16KB can be encoded without extension word. > > > > > Great point! Of course, that makes perfect sense. I was feeling like I was > wasting the wosize bits with the extension word but couldn't quite get put > 2 and 2 together. > BTW, down the thread is a newer version of the design that reduces the tag > space to 8000 tags, which I do think is sufficient. > > > > > > 32 bit custom word (default usage - present only if cust is 1): > > > +----+----------------------------------------+ > > > |nofp| pfbits | > > > +----+----------------------------------------+ > > > bits 31 30 0 > > > > > > - nofp: a structure with no floats. All pfbits are used for pointers, > > with > > > a 1 signifying a pointer and a 0 signifying a value. > > > - pfbits: indicates which double words are floats and pointers. Starting > > at > > > the highest bit: > > > - a 0 indicates neither a pointer nor a float > > > - a 10 indicates a float (double) > > > - a 11 indicates a pointer > > > - If noptr is set, each bit indicates a float. If nofp is set, each > > bit > > > indicates a pointer. > > > > There are 3 kinds of values: > > > > 1) pointers with bit 0 == 0 > > 2) non-pointers with bit 0 == 1 > > 3) floats with all bits used for the type (spanning 2 fields in 32bit) > > > > So if pfbits indicates a float then a field (or 2) is a float and all > > bits are used for the value. Otherwise the bit 0 of the field will > > tell you wether it is a pointer or not. So why would you want to > > duplicate that information in the pfbits? > > > > I was thinking of doing it for efficiency. If we're already indicating > what's what, we might as well represent shortcuts to the pointers, which > would cut down on the amount of reading, no? In the average case, the GC > would need to access a lot less memory. > > > > It might be nice to support C values like untagged ints or unaligned > > pointers. If Custom tag is set then the pfbits become ocaml value > > bits. The GC will only inspect fields with pfbit set. All other fields > > are ignored. The custom_operations handle compare, hash, serialize and > > deserialize so nothing else will access the data. > > > > Another thing are int32 and int64. I guess if you want to unbox those > > then having 2 bits per field in pfbits makes sense again. But then I > > would allocate them as: > > > > - a 00 indicates a tagged value (int or pointer) > > - a 01 indicates a non-pointer: int, int32, native int, C pointer > > - a 10 indicates a float (double) > > - a 11 indicates an int64 > > > > The higher bit would indicate a 64bit value, meaning spanning 2 fields > > on 32bit. Not that those 4 values allow mixing ocaml values, C values, > > int32, int64 and float in a block. > > > > I would combine the noptr and nofp bits into a single 2bit field: > > > > - a 00 indicates no pointers and no double size, no pfbits > > - a 01 indicates no double size, pfbits indicate tagged / non-pointer > > - a 10 indicates no pointers but double size, pfbits indicate size > > - a 11 indicates both pointers and double size, 2 pfbits per field > > > > Note: tagged integers can be stored as 00 or 01. I think this would be > > required for polymorphic types. An 'a could be int or pointer. In both > > cases 00 will work. > > > > > I really like this idea -- unboxing more types could be really useful. I'm > not sure double 'size' would work, however. It should be fine for the > marshal module, but polymorphic comparison would get messed up because > floats have to be compared differently. So I think 10 in the bit field > should indicate no pointers but floats, while 11 could allow both pointers > and double size, with the 2-bits specifying if it's a float or an int64 (as > you've outlined). Of course, one cannot have both shortcuts to pointers and > enhanced unboxing, so let me know what you think about the performance > increase from shortcutting the tag bit. > > Yotam Lets look at an example: type 'a r = { a:int; b:float; c:int32; d:int64; e:'a; } For 16-bit and 32-bit architectures: +--------------------+----------+-------+------+ | wosize |pfbit type| color | tag | +--------------------+----------+-------+------+ bits 31 20 19 18 17 16 15 0 wosize = 7 pfbit type = 11 (pointers and double size) +------------------------------+--+--+--+--+--+ | pfbits |00|11|01|10|01| +------------------------------+--+--+--+--+--+ e d c b a The GC only needs to check e since 'a might be a pointer. All other fields are marked as non pointer. Comparison does a plain bit comparison on a, c and d, a float comparison on b and a tagged comparison on e. Similar for marshaling. There is no confusion between int64 and floats. MfG Goswin ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Caml-list] Proposal: re-design of ocaml headers 2013-10-08 10:52 ` Goswin von Brederlow @ 2013-10-11 15:48 ` Yotam Barnoy 2014-01-30 20:53 ` Yotam Barnoy 2014-02-01 15:27 ` Goswin von Brederlow 0 siblings, 2 replies; 18+ messages in thread From: Yotam Barnoy @ 2013-10-11 15:48 UTC (permalink / raw) To: Goswin von Brederlow; +Cc: Ocaml Mailing List [-- Attachment #1: Type: text/plain, Size: 10282 bytes --] I had an idea based on (or actually copied from) something Goswin mentioned earlier in our discussion. What if the bits can be used to indicate embedded values? An embedded value would have a header inside the body of the parent value, making it possible to get rid of all pointers within that parent. This would represent a potentially large saving for the GC, as well as reducing random jumps around memory which are very cache-stressing. In the spirit of this idea, here is my latest version: + For 16-bit and 32-bit architectures: +---------------+-------------+-----+-------+------+ | wosize |pfbits type |noptr| color | tag | +---------------+-------------+-----+-------+------+ bits 31 19 18 17 16 15 14 13 12 0 pfbits type: - 000: no pfbits - 001: pfbits indicate embedded header - 010: pfbits indicate float - 011: pfbits indicate untagged type - 100: pfbits indicate int64 - 101: pfbits indicate float/embdedded header/untagged type, 2 bits per field - 110: pfbits indicate float/untagged type/int64, 2 bits per field - 111: pfbits indicate embedded header/untagged type/int64, 2 bits per field - noptr: no pointers present - if wosize = 0, the extension word is used for wosize - if both wosize = 0 and the pfbits are used, the wosize_large is first in memory wosize_large word (if wosize is 0 in the header) +---------------------------------------------+ | wosize | +---------------------------------------------+ bits 31 0 32 bit pfbits word (present only if called for by pfbits type in header) +---------------------------------------------+ | pfbits | +---------------------------------------------+ bits 31 30 0 - pfbits: - If working in 1 bit mode, each bit represents whatever is signaled in the pfbits type field. - If working in 2 bit mode, 00 always represents a regular word, and 01, 10, 11 represent their type as signaled in the pfbit type field. + For 64-bit architectures: +-------------+--------+----------+---+------+-------+------+ | pfbits | wosize |pfbit type|exp| noptr| color | tag | +-------------+--------+----------+---+------+-------+------+ bits 63 40 39 20 19 17 16 15 14 13 12 0 - noptr: a structure with no pointers. - pfbits: a small pfbits field for smaller objects - pfbits type: (slightly different than 32-bit architecture) - 000: no pfbits, wosize includes pfbits as its upper bits - 001: pfbits indicate embedded header - 010: pfbits indicate float - 011: pfbits indicate untagged type - 100: pfbits indicate int64 - 101: pfbits indicate float/untagged type/embedded header, 2 bits per field - 110: pfbits indicate float/untagged type/int64, 2 bits per field - 111: pfbits indicate int64/untagged type/embedded header, 2 bits per field - exp: use pfbits_expanded for signaling pfbits. Pfbits in header become top bits of wosize. +--------------------------------------------------------+ | pfbits_expanded | +--------------------------------------------------------+ bits 63 0 - pfbits_expanded: if exp is set, pfbits_expanded takes the place of the pfbits. wo_size is joined with the pfbits in the header. + Tags: - 0: Array, record, tuple tag - 1: Infix tag (must be 1 mod 4) - 2: Closure tag - 3: Lazy tag - 4: Object tag - 5: Forward tag - 6: Abstract tag - 7: String tag - 8: Double tag - 9: Custom tag - 10: Double_array tag - 11: Proposed: Int32_array tag - 12: Proposed: Int64_array tag - 13: Proposed: Cptr_array tag - 14: Proposed: float32_array tag - 1000: first user tag -Yotam On Tue, Oct 8, 2013 at 6:52 AM, Goswin von Brederlow <goswin-v-b@web.de>wrote: > On Mon, Sep 30, 2013 at 11:31:23AM -0400, Yotam Barnoy wrote: > > On Mon, Sep 30, 2013 at 10:48 AM, Goswin von Brederlow < > goswin-v-b@web.de>wrote: > > > > > > > > > > + For 16-bit and 32-bit architectures: > > > > +---------------+----+----+-----+-------+------+ > > > > | wosize | ext|cust|noptr| color | tag | > > > > +---------------+----+----+-----+-------+------+ > > > > bits 31 21 20 19 18 17 16 15 0 > > > > > > > > - noptr: no pointers present > > > > - ext: uses extension word > > > > - cust(om): uses custom word. Custom word is normally used to > indicate > > > > floats and pointers. > > > > > > > > 32 bit extension word (present only if ext is 1) > > > > +---------------------------------------------+ > > > > | wosize | > > > > +---------------------------------------------+ > > > > bits 31 0 > > > > > > Why use a full bit for ext? I would define wosize == 0 to mean an > > > extension word with the actual size is present. That way sizes up to > > > <16KB can be encoded without extension word. > > > > > > > > Great point! Of course, that makes perfect sense. I was feeling like I > was > > wasting the wosize bits with the extension word but couldn't quite get > put > > 2 and 2 together. > > BTW, down the thread is a newer version of the design that reduces the > tag > > space to 8000 tags, which I do think is sufficient. > > > > > > > > > > 32 bit custom word (default usage - present only if cust is 1): > > > > +----+----------------------------------------+ > > > > |nofp| pfbits | > > > > +----+----------------------------------------+ > > > > bits 31 30 0 > > > > > > > > - nofp: a structure with no floats. All pfbits are used for pointers, > > > with > > > > a 1 signifying a pointer and a 0 signifying a value. > > > > - pfbits: indicates which double words are floats and pointers. > Starting > > > at > > > > the highest bit: > > > > - a 0 indicates neither a pointer nor a float > > > > - a 10 indicates a float (double) > > > > - a 11 indicates a pointer > > > > - If noptr is set, each bit indicates a float. If nofp is set, > each > > > bit > > > > indicates a pointer. > > > > > > There are 3 kinds of values: > > > > > > 1) pointers with bit 0 == 0 > > > 2) non-pointers with bit 0 == 1 > > > 3) floats with all bits used for the type (spanning 2 fields in 32bit) > > > > > > So if pfbits indicates a float then a field (or 2) is a float and all > > > bits are used for the value. Otherwise the bit 0 of the field will > > > tell you wether it is a pointer or not. So why would you want to > > > duplicate that information in the pfbits? > > > > > > > I was thinking of doing it for efficiency. If we're already indicating > > what's what, we might as well represent shortcuts to the pointers, which > > would cut down on the amount of reading, no? In the average case, the GC > > would need to access a lot less memory. > > > > > > > It might be nice to support C values like untagged ints or unaligned > > > pointers. If Custom tag is set then the pfbits become ocaml value > > > bits. The GC will only inspect fields with pfbit set. All other fields > > > are ignored. The custom_operations handle compare, hash, serialize and > > > deserialize so nothing else will access the data. > > > > > > Another thing are int32 and int64. I guess if you want to unbox those > > > then having 2 bits per field in pfbits makes sense again. But then I > > > would allocate them as: > > > > > > - a 00 indicates a tagged value (int or pointer) > > > - a 01 indicates a non-pointer: int, int32, native int, C pointer > > > - a 10 indicates a float (double) > > > - a 11 indicates an int64 > > > > > > The higher bit would indicate a 64bit value, meaning spanning 2 fields > > > on 32bit. Not that those 4 values allow mixing ocaml values, C values, > > > int32, int64 and float in a block. > > > > > > I would combine the noptr and nofp bits into a single 2bit field: > > > > > > - a 00 indicates no pointers and no double size, no pfbits > > > - a 01 indicates no double size, pfbits indicate tagged / > non-pointer > > > - a 10 indicates no pointers but double size, pfbits indicate size > > > - a 11 indicates both pointers and double size, 2 pfbits per field > > > > > > Note: tagged integers can be stored as 00 or 01. I think this would be > > > required for polymorphic types. An 'a could be int or pointer. In both > > > cases 00 will work. > > > > > > > > I really like this idea -- unboxing more types could be really useful. > I'm > > not sure double 'size' would work, however. It should be fine for the > > marshal module, but polymorphic comparison would get messed up because > > floats have to be compared differently. So I think 10 in the bit field > > should indicate no pointers but floats, while 11 could allow both > pointers > > and double size, with the 2-bits specifying if it's a float or an int64 > (as > > you've outlined). Of course, one cannot have both shortcuts to pointers > and > > enhanced unboxing, so let me know what you think about the performance > > increase from shortcutting the tag bit. > > > > Yotam > > Lets look at an example: > > type 'a r = { a:int; b:float; c:int32; d:int64; e:'a; } > > For 16-bit and 32-bit architectures: > +--------------------+----------+-------+------+ > | wosize |pfbit type| color | tag | > +--------------------+----------+-------+------+ > bits 31 20 19 18 17 16 15 0 > > wosize = 7 > pfbit type = 11 (pointers and double size) > > +------------------------------+--+--+--+--+--+ > | pfbits |00|11|01|10|01| > +------------------------------+--+--+--+--+--+ > e d c b a > > The GC only needs to check e since 'a might be a pointer. All other fields > are marked as non pointer. > > Comparison does a plain bit comparison on a, c and d, a float > comparison on b and a tagged comparison on e. Similar for marshaling. > There is no confusion between int64 and floats. > > MfG > Goswin > [-- Attachment #2: Type: text/html, Size: 12300 bytes --] ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Caml-list] Proposal: re-design of ocaml headers 2013-10-11 15:48 ` Yotam Barnoy @ 2014-01-30 20:53 ` Yotam Barnoy 2014-02-01 15:27 ` Goswin von Brederlow 1 sibling, 0 replies; 18+ messages in thread From: Yotam Barnoy @ 2014-01-30 20:53 UTC (permalink / raw) To: Goswin von Brederlow; +Cc: Ocaml Mailing List [-- Attachment #1: Type: text/plain, Size: 5185 bytes --] I'm resurrecting this thread. Given the recent discussion about optimization, I've done some more thinking and simplification work on my proposal for a better ocaml header. Much of this is just adopting Goswin's proposal, which made a lot of sense. I tried to also reduce the variations while allowing for a lot of useful unboxing of floats, int64s etc. I've also tried to tackle tuples in this version -- basically, in order to be compatible with polymorphic functions, you really want tuples and all other polymorphic types (as in 'a embedded inside records) to be of uniform size. Expanding that size under some circumstances to 64 bits on 32 bit platforms makes sense if, for example, the tuple contains many floats. I don't think handling both what I call narrow and wide polymorphic types (for lack of a better name) will make the code much more complex -- I certainly don't want it to result in heavy register spilling. Also, the dynamic parts of the header now appear BEFORE the header itself. Their highest bit is 0 to indicate that they're not the main header. This will be seen by GC code, which can afford to do the few extra comparisons. Mutator code will always have the main header available right before the data. So here it is: + For 32-bit architectures: --------------------------- Wide Types ---------- - For polymorphic types (containing 'a), member sizes need to be uniform for speed. Tuples are fully polymorphic. These polymorphic members can either be narrow (32 bit) as they are now or wide (64 bit) to accomodate float/int64 unboxing. +-----------+--------------+------+-----+-------+------+ | 1 | wosize| fbits |ebits |noptr| color | tag | +-----------+--------------+------+-----+-------+------+ bits 31 30 21 20 15 14 13 12 11 10 0 - noptr: no pointers present. - fbits: wide types cannot be represented unless extbits is used - 00: tagged (int/pointer) - 01: int32/native int, C pointer - 10: float - 11: int64 - if wosize = 0, the size word is used for wosize size word --------- - only present if wosize is 0 in the header. Precedes the main header +---------------------------------------------+ | 0 | wosize | +---------------------------------------------+ bits 31 30 0 - bit 31 is used to identify this as a header extension ext word -------- - Only present if ebits is 1. Describes the first 15 members of the object (+ 3 members from fbits) +----------+----------------------------------+ | 0 | wide | extbits | +----------+----------------------------------+ bits 31 30 29 0 - bit 31 is used to identify this as a header extension - wide: determines if 'a types in the first 18 words are wide (64 bit) or narrow (32 bits) - extbits: same as fbits Tuples ------ - Tuples with any fbits on in the header word are automatically wide tuples (64 bit) for the first 3 words - Tuples with ebits are automatically wide tuples for the first 18 words. wide is ignored. Strings ------- - In strings, the last fbit, ebits and noptr function as the string size modifier (currently present at the end of the string). This improves cache locality on large strings. Wosize expands to include the fbits. Arrays ------ - Arrays of integers, int64, floats, C pointers etc can all be handled with the regular array type. - Only 2 lower fbits are needed for the type. The other fbits are joined with wosize. + 64-bit architectures: -------------------------- +-----------+------------------+------+-----+-------+------+ | 1 | wosize| fbits |ebits |noptr| color | tag | +-----------+------------------+------+-----+-------+------+ bits 63 62 43 42 15 14 13 12 11 10 0 - noptr: a structure with no pointers. - fbits: 2 bits per object member - 00: tagged (int/pointer) - 01: int32 - 10: float - 11: int64/native int/C pointer - ebits: use ext word for signaling bits. fbits in header become bottom bits of wosize. ext_word -------- - Only present if ebits is 1. Describes the first 31 members of the object. +---------------------------------------------+ | 0 | - | extbits | +---------------------------------------------+ bits 63 62 61 0 - bit 31 is used to identify this as a header extension - extbits: same as fbits Strings ------- - In strings, the last fbit, ebits and noptr function as the string size modifier (currently present at the end of the string). This improves cache locality on large strings. Wosize expands Arrays ------ - Only the 2 lowest fbits are needed to discriminate the type. All other fbits are joined to wosize. + Tags: ------- - 0: Array tag - 1: Record tag - 2: Tuple tag - 3: Infix tag - 4: Closure tag - 5: Lazy tag - 6: Object tag - 7: Forward tag - 8: Abstract tag - 9: String tag (pfbit type used for size completion) - 10: Primitive value tag (double, int64, int32) - 11: Custom tag - 100: First user tag Comments? Yotam [-- Attachment #2: Type: text/html, Size: 6012 bytes --] ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Caml-list] Proposal: re-design of ocaml headers 2013-10-11 15:48 ` Yotam Barnoy 2014-01-30 20:53 ` Yotam Barnoy @ 2014-02-01 15:27 ` Goswin von Brederlow 1 sibling, 0 replies; 18+ messages in thread From: Goswin von Brederlow @ 2014-02-01 15:27 UTC (permalink / raw) To: Yotam Barnoy; +Cc: Ocaml Mailing List On Fri, Oct 11, 2013 at 11:48:52AM -0400, Yotam Barnoy wrote: > I had an idea based on (or actually copied from) something Goswin mentioned > earlier in our discussion. What if the bits can be used to indicate > embedded values? An embedded value would have a header inside the body of > the parent value, making it possible to get rid of all pointers within that > parent. This would represent a potentially large saving for the GC, as well > as reducing random jumps around memory which are very cache-stressing. Embedded values would be something like this: type p2 = { x : int; y : int; } type p3 = { xy : embedded p2; z : int; } The type p3 would still be a single block and p3.xy.x would not need an extra indirection. Right? I'm not sure this would be too usefull overall. In the above example it would be far better to have a row type like in objects. Something like type p3 = { p2 with z : int; } That would even allow using p3.x instead of p3.xy.x. It would also allow passing a p3 record to a function expecting a p2 record. With embedded types that would not be possible. An embedded type would have to be passed to other functions as pointer to the begining of the parent and offset within. Another thing is, as recently mentioned, that the compiler aparently aggregates allocations. So if you write let p2 = { x = 1; y = 2; } in let p3 = { xy = p2; z = 3; } in that results in a single allocation and p2 and p3 will be cache local. I think they will even stay next to each other during compression. Right? Embedding would get rid of the xy pointer and indirection but would it help so much? Would it be used widely? Given that embedded and not embedded types are incompatible I think this would rather be limited to stay within modules and never cross the module boundary. For example an internal hashtable would make use of it. And why have extra headers for the embedded values? Incorporate them into the main header directly and make type p3 = { xy : embedded p2; z : int; } equivalent to type p3 = { xy.x : int; xy.y: int; z : int; } A block containing 3 values. Embedding extra headers would only be usefull when embedding largish float arrays (which don't need a bit for every member) into large records. And then why not use the indifection? Given then size it probably will be negible. And for small structures use the individual bits to mark every float member independently. So overally I'm doubtfull the added complexity to recurse into embedded headers is worth it. MfG Goswin ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Caml-list] Proposal: re-design of ocaml headers 2013-09-27 14:05 [Caml-list] Proposal: re-design of ocaml headers Yotam Barnoy 2013-09-27 15:08 ` Dmitry Grebeniuk 2013-09-30 14:48 ` Goswin von Brederlow @ 2013-10-06 10:39 ` Florian Weimer 2 siblings, 0 replies; 18+ messages in thread From: Florian Weimer @ 2013-10-06 10:39 UTC (permalink / raw) To: Yotam Barnoy; +Cc: Ocaml Mailing List * Yotam Barnoy: > The pfbits (as shown below) are an efficient way of representing both > floats and pointers in a data structure, at the cost of disallowing random > access. From what I can gather, random access to this data is never needed, > since the GC, Marshal module, and polymorphic comparison all process the > whole data structure rather than referring to specific parts of it. I wonder if it could make sense to get rid of polymorphic comparison and make Marshal type-safe before changing the header. If the header is needed for GC only and for telling variants apart, it would not be necessary to discriminate between strings, floats and custom memory blocks in the header. That would give room for 28 bits for the tag/size combination or the string or value array length, with 2 bits used by the GC colors and 2 bits for the type bits (telling apart non-scanned blobs, scanned arrays of values, scanned tagged variants, and unscanned tagged variants). Arrays are separate, so 28 bits should be plenty of room for both tag and variant record length. This way, it would be possible to encode both string and array lengths directly in the header, speeding up bounds checks. The GC would have check the type bits to compute the actual object size from the length in the header, but mutator performance seems more important, and perhaps we can come up with a clever way to decode the length without branching. It might even make sense to store the actual number of elements of a double array in the header, again to speed up bounds checks. If the 32-bit 256 MB string size limit is too onerous, it should be possible to bump it to 512 MB, but that will complicate header decoding further. But it's possible to index only 1 GB, so 256 MB is already pretty close to that hard limit. ^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2014-02-01 15:27 UTC | newest] Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2013-09-27 14:05 [Caml-list] Proposal: re-design of ocaml headers Yotam Barnoy 2013-09-27 15:08 ` Dmitry Grebeniuk [not found] ` <CAN6ygOmuCX6HLfSns0tXQCF3LWMANqhpnSN0vGWcNg0one2QzQ@mail.gmail.com> 2013-09-27 15:25 ` [Caml-list] Fwd: " Yotam Barnoy 2013-09-27 16:20 ` Dmitry Grebeniuk 2013-09-27 18:08 ` Yotam Barnoy 2013-09-27 18:12 ` Yotam Barnoy 2013-09-27 18:15 ` Paolo Donadeo 2013-09-27 18:41 ` Yotam Barnoy 2013-09-27 15:31 ` [Caml-list] " Anthony Tavener 2013-09-27 15:37 ` Yotam Barnoy 2013-09-27 16:50 ` Dmitry Grebeniuk 2013-09-30 14:48 ` Goswin von Brederlow 2013-09-30 15:31 ` Yotam Barnoy 2013-10-08 10:52 ` Goswin von Brederlow 2013-10-11 15:48 ` Yotam Barnoy 2014-01-30 20:53 ` Yotam Barnoy 2014-02-01 15:27 ` Goswin von Brederlow 2013-10-06 10:39 ` Florian Weimer
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox