* [Caml-list] Expanding the Float Array tag @ 2013-09-16 15:26 Yotam Barnoy 2013-09-16 16:29 ` Markus Mottl 2013-09-17 9:32 ` Gerd Stolpmann 0 siblings, 2 replies; 14+ messages in thread From: Yotam Barnoy @ 2013-09-16 15:26 UTC (permalink / raw) To: Ocaml Mailing List [-- Attachment #1: Type: text/plain, Size: 738 bytes --] Having looked through some of the ocaml runtime's code, I have a question regarding the Double_array block tag. Why not use a single tag for all block content that doesn't contain pointers instead? This would allow optimization of all cases where no pointers are involved, including float tuples, records with ints, bools and floats etc. The only use-case I've seen so far for Double_array tags is for polymorphic comparison ie. we need the type information to parse doubles correctly. However, the only default comparison that's valid on an array of anything is an equality comparison, which is easily doable without type information. Therefore, I'm confused as to why this is necessary. Thanks in advance for any answers Yotam Barnoy [-- Attachment #2: Type: text/html, Size: 830 bytes --] ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Caml-list] Expanding the Float Array tag 2013-09-16 15:26 [Caml-list] Expanding the Float Array tag Yotam Barnoy @ 2013-09-16 16:29 ` Markus Mottl 2013-09-16 16:49 ` Yotam Barnoy 2013-09-17 9:32 ` Gerd Stolpmann 1 sibling, 1 reply; 14+ messages in thread From: Markus Mottl @ 2013-09-16 16:29 UTC (permalink / raw) To: Yotam Barnoy; +Cc: Ocaml Mailing List On Mon, Sep 16, 2013 at 11:26 AM, Yotam Barnoy <yotambarnoy@gmail.com> wrote: > Having looked through some of the ocaml runtime's code, I have a question > regarding the Double_array block tag. Why not use a single tag for all block > content that doesn't contain pointers instead? This would allow optimization > of all cases where no pointers are involved, including float tuples, records > with ints, bools and floats etc. Floats are a little tricky, because they are always doubles (64 bits), even on 32-bit platforms. This requires some special identification. Adding new tags by reducing the "No_scan_tag" might be a bad (and not backward compatible) approach, too: the maximum number of non-constant constructors is already pretty low at 246. I think this number is too small these days where 64-bit platforms are standard. It's probably hard to change this design decision now by reducing the overly generous maximum "wosize". Some automatically generated APIs can easily blow the current limit on non-constant constructors and require annoying, less efficient workarounds. I guess it might be possible to allocate blocks that are known to be all integers or atomic sum types by using the already available Abstract_tag. Large arrays would benefit most from that. Doing this in the compiler might break old marshaled data. But if performance is really critical, I could imagine trying that in a library with well-hidden type representations. Regards, Markus -- Markus Mottl http://www.ocaml.info markus.mottl@gmail.com ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Caml-list] Expanding the Float Array tag 2013-09-16 16:29 ` Markus Mottl @ 2013-09-16 16:49 ` Yotam Barnoy 2013-09-16 17:14 ` Markus Mottl 2013-09-19 9:40 ` Goswin von Brederlow 0 siblings, 2 replies; 14+ messages in thread From: Yotam Barnoy @ 2013-09-16 16:49 UTC (permalink / raw) To: Markus Mottl; +Cc: Ocaml Mailing List [-- Attachment #1: Type: text/plain, Size: 2772 bytes --] Why do doubles need special handling though, even on a 32-bit system? My suggestion is that the Double_tag be changed to Flat_tag, meaning that all non-pointer objects can reside in this tag. The only issue I've found so far is that polymorphic <, <=, > and >= would not work. However, these operators should not be allowed on a vector anyway since there is no natural ordering scheme for vectors. If there are other issues, please let me know. I agree regarding the expansion of 246 constructors. This must have been kept for compatibility with 32 bit systems. I think what should happen in 32 bit systems is that one constructor should be reserved for having >246 constructors, in which case another word of memory could be utilized for the constructor code. In fact, you'd only need to use that extra word if the particular constructor exceeds 246. In 64 bit systems, the constructor count could easily be increased by a few bits, with the same backup mechanism for when you have more than X constructors (X being the maximum number of constructors). Regards, Yotam On Mon, Sep 16, 2013 at 12:29 PM, Markus Mottl <markus.mottl@gmail.com>wrote: > On Mon, Sep 16, 2013 at 11:26 AM, Yotam Barnoy <yotambarnoy@gmail.com> > wrote: > > Having looked through some of the ocaml runtime's code, I have a question > > regarding the Double_array block tag. Why not use a single tag for all > block > > content that doesn't contain pointers instead? This would allow > optimization > > of all cases where no pointers are involved, including float tuples, > records > > with ints, bools and floats etc. > > Floats are a little tricky, because they are always doubles (64 bits), > even on 32-bit platforms. This requires some special identification. > > Adding new tags by reducing the "No_scan_tag" might be a bad (and not > backward compatible) approach, too: the maximum number of non-constant > constructors is already pretty low at 246. I think this number is too > small these days where 64-bit platforms are standard. It's probably > hard to change this design decision now by reducing the overly > generous maximum "wosize". Some automatically generated APIs can > easily blow the current limit on non-constant constructors and require > annoying, less efficient workarounds. > > I guess it might be possible to allocate blocks that are known to be > all integers or atomic sum types by using the already available > Abstract_tag. Large arrays would benefit most from that. Doing this > in the compiler might break old marshaled data. But if performance is > really critical, I could imagine trying that in a library with > well-hidden type representations. > > Regards, > Markus > > -- > Markus Mottl http://www.ocaml.info markus.mottl@gmail.com > [-- Attachment #2: Type: text/html, Size: 3512 bytes --] ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Caml-list] Expanding the Float Array tag 2013-09-16 16:49 ` Yotam Barnoy @ 2013-09-16 17:14 ` Markus Mottl 2013-09-16 19:09 ` Yotam Barnoy 2013-09-19 9:40 ` Goswin von Brederlow 1 sibling, 1 reply; 14+ messages in thread From: Markus Mottl @ 2013-09-16 17:14 UTC (permalink / raw) To: Yotam Barnoy; +Cc: Ocaml Mailing List On Mon, Sep 16, 2013 at 12:49 PM, Yotam Barnoy <yotambarnoy@gmail.com> wrote: > Why do doubles need special handling though, even on a 32-bit system? My > suggestion is that the Double_tag be changed to Flat_tag, meaning that all > non-pointer objects can reside in this tag. The only issue I've found so far > is that polymorphic <, <=, > and >= would not work. However, these operators > should not be allowed on a vector anyway since there is no natural ordering > scheme for vectors. If there are other issues, please let me know. Here is a problem: If you marshal a float array on a 64-bit platform, how is the 32-bit platform supposed to know about the "logical" and "physical" size of the array? On 64-bit platforms where everything is the same size the distinction wouldn't matter, of course, but on 32-bit it does. The header word can only provide one size (but tag distinctions). > I agree regarding the expansion of 246 constructors. This must have been > kept for compatibility with 32 bit systems. I think what should happen in 32 > bit systems is that one constructor should be reserved for having >246 > constructors, in which case another word of memory could be utilized for the > constructor code. In fact, you'd only need to use that extra word if the > particular constructor exceeds 246. Appending such a "long" constructor to the end of the block is probably doable with acceptable complexity. In that position it should not require many changes elsewhere in the runtime system or code generators. On 64-bit platforms I'd still prefer seeing a change of the header field layout for efficiency reasons even though this would likely make a lot of "marshal" users cry and maybe require more effort to implement. But I wouldn't mind just using "extended" blocks with the current header layout there, too, if this made everybody happy. Regards, Markus -- Markus Mottl http://www.ocaml.info markus.mottl@gmail.com ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Caml-list] Expanding the Float Array tag 2013-09-16 17:14 ` Markus Mottl @ 2013-09-16 19:09 ` Yotam Barnoy 2013-09-17 0:31 ` Yotam Barnoy 0 siblings, 1 reply; 14+ messages in thread From: Yotam Barnoy @ 2013-09-16 19:09 UTC (permalink / raw) To: Markus Mottl; +Cc: Ocaml Mailing List [-- Attachment #1: Type: text/plain, Size: 1659 bytes --] That's a good point. Another relatively easy optimization would be to use a bit from the header on 64-bit platforms (32-bit platforms have no available bits) to indicate another form of extension, whereby an extra word is used as a bitmap to indicate which words are floats. Haskell uses a similar trick to indicate which words are pointers on the stack. This would remove the indirection of floats in the majority of cases, except of course in the stack itself. This shouldn't have an impact on marshaling. BTW bits in the 64-bit header should probably have been marked as reserved rather than making the wo_size field impossibly large. Yotam On Mon, Sep 16, 2013 at 1:14 PM, Markus Mottl <markus.mottl@gmail.com>wrote: > On Mon, Sep 16, 2013 at 12:49 PM, Yotam Barnoy <yotambarnoy@gmail.com> > wrote: > > Why do doubles need special handling though, even on a 32-bit system? My > > suggestion is that the Double_tag be changed to Flat_tag, meaning that > all > > non-pointer objects can reside in this tag. The only issue I've found so > far > > is that polymorphic <, <=, > and >= would not work. However, these > operators > > should not be allowed on a vector anyway since there is no natural > ordering > > scheme for vectors. If there are other issues, please let me know. > > Here is a problem: If you marshal a float array on a 64-bit platform, > how is the 32-bit platform supposed to know about the "logical" and > "physical" size of the array? On 64-bit platforms where everything is > the same size the distinction wouldn't matter, of course, but on > 32-bit it does. The header word can only provide one size (but tag > distinctions). > [-- Attachment #2: Type: text/html, Size: 2246 bytes --] ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Caml-list] Expanding the Float Array tag 2013-09-16 19:09 ` Yotam Barnoy @ 2013-09-17 0:31 ` Yotam Barnoy 0 siblings, 0 replies; 14+ messages in thread From: Yotam Barnoy @ 2013-09-17 0:31 UTC (permalink / raw) To: Markus Mottl; +Cc: Ocaml Mailing List [-- Attachment #1: Type: text/plain, Size: 2322 bytes --] At the same time, there's something to be said for not having this functionality in the runtime system at all. Why should marshaling even be done at a layer that lacks proper type information? What if we wanted to make data representation more efficient by packing booleans together under some circumstances? The marshal code wouldn't be able to handle it because it has no type information. I would much rather see a transition to Core's bin_prot ie. generated, typed code and forget about the marshal module altogether. Yotam On Mon, Sep 16, 2013 at 3:09 PM, Yotam Barnoy <yotambarnoy@gmail.com> wrote: > That's a good point. > > Another relatively easy optimization would be to use a bit from the header > on 64-bit platforms (32-bit platforms have no available bits) to indicate > another form of extension, whereby an extra word is used as a bitmap to > indicate which words are floats. Haskell uses a similar trick to indicate > which words are pointers on the stack. This would remove the indirection of > floats in the majority of cases, except of course in the stack itself. > This shouldn't have an impact on marshaling. > > BTW bits in the 64-bit header should probably have been marked as reserved > rather than making the wo_size field impossibly large. > > Yotam > > > > On Mon, Sep 16, 2013 at 1:14 PM, Markus Mottl <markus.mottl@gmail.com>wrote: > >> On Mon, Sep 16, 2013 at 12:49 PM, Yotam Barnoy <yotambarnoy@gmail.com> >> wrote: >> > Why do doubles need special handling though, even on a 32-bit system? My >> > suggestion is that the Double_tag be changed to Flat_tag, meaning that >> all >> > non-pointer objects can reside in this tag. The only issue I've found >> so far >> > is that polymorphic <, <=, > and >= would not work. However, these >> operators >> > should not be allowed on a vector anyway since there is no natural >> ordering >> > scheme for vectors. If there are other issues, please let me know. >> >> Here is a problem: If you marshal a float array on a 64-bit platform, >> how is the 32-bit platform supposed to know about the "logical" and >> "physical" size of the array? On 64-bit platforms where everything is >> the same size the distinction wouldn't matter, of course, but on >> 32-bit it does. The header word can only provide one size (but tag >> distinctions). >> > [-- Attachment #2: Type: text/html, Size: 3342 bytes --] ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Caml-list] Expanding the Float Array tag 2013-09-16 16:49 ` Yotam Barnoy 2013-09-16 17:14 ` Markus Mottl @ 2013-09-19 9:40 ` Goswin von Brederlow 1 sibling, 0 replies; 14+ messages in thread From: Goswin von Brederlow @ 2013-09-19 9:40 UTC (permalink / raw) To: caml-list On Mon, Sep 16, 2013 at 12:49:33PM -0400, Yotam Barnoy wrote: > Why do doubles need special handling though, even on a 32-bit system? My > suggestion is that the Double_tag be changed to Flat_tag, meaning that all > non-pointer objects can reside in this tag. The only issue I've found so > far is that polymorphic <, <=, > and >= would not work. However, these > operators should not be allowed on a vector anyway since there is no > natural ordering scheme for vectors. If there are other issues, please let > me know. Vectors are naturally ordered by the order of their first element that isn't equal. The Double_tag is also required for polymorphic access to the field since on 32 bit the index needs to be doubled and on all archs the float is boxed. MfG Goswin ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Caml-list] Expanding the Float Array tag 2013-09-16 15:26 [Caml-list] Expanding the Float Array tag Yotam Barnoy 2013-09-16 16:29 ` Markus Mottl @ 2013-09-17 9:32 ` Gerd Stolpmann 2013-09-18 15:10 ` Yotam Barnoy 1 sibling, 1 reply; 14+ messages in thread From: Gerd Stolpmann @ 2013-09-17 9:32 UTC (permalink / raw) To: Yotam Barnoy; +Cc: Ocaml Mailing List [-- Attachment #1: Type: text/plain, Size: 1817 bytes --] Am Montag, den 16.09.2013, 11:26 -0400 schrieb Yotam Barnoy: > Having looked through some of the ocaml runtime's code, I have a > question regarding the Double_array block tag. Why not use a single > tag for all block content that doesn't contain pointers instead? This > would allow optimization of all cases where no pointers are involved, > including float tuples, records with ints, bools and floats etc. > > The only use-case I've seen so far for Double_array tags is for > polymorphic comparison ie. we need the type information to parse > doubles correctly. However, the only default comparison that's valid > on an array of anything is an equality comparison, which is easily > doable without type information. Therefore, I'm confused as to why > this is necessary. You also need the Double_array tags for normal array accesses on 32 bit platforms: If you call a polymorphic function taking an array argument, the function doesn't know whether it is called with a float array or a normal array. Because of this, the compiler generates a dynamic check whether the array is float or something else. For float, every element is 64 bits wide, but for anything else it is 32 bits only. You are right that a special "no-scan" tag would speed up the GC marking phase when there are large arrays profiting from it - for all other no-scan cases except float. Gerd > > Thanks in advance for any answers > > Yotam Barnoy > -- ------------------------------------------------------------ Gerd Stolpmann, Darmstadt, Germany gerd@gerd-stolpmann.de My OCaml site: http://www.camlcity.org Contact details: http://www.camlcity.org/contact.html Company homepage: http://www.gerd-stolpmann.de ------------------------------------------------------------ [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 490 bytes --] ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Caml-list] Expanding the Float Array tag 2013-09-17 9:32 ` Gerd Stolpmann @ 2013-09-18 15:10 ` Yotam Barnoy 2013-09-19 6:18 ` oleg 2013-09-19 10:10 ` Goswin von Brederlow 0 siblings, 2 replies; 14+ messages in thread From: Yotam Barnoy @ 2013-09-18 15:10 UTC (permalink / raw) To: Gerd Stolpmann; +Cc: Ocaml Mailing List [-- Attachment #1: Type: text/plain, Size: 2929 bytes --] So here is my tentative proposal: For 32-bit platforms, a specific tag will signify the extra header word. This word will have 16 bits' worth of tag. I think 65000 tags are enough, right? This isn't one of those "128KB will always be enough" kind of thing, is it? The top 16 bits can be used for other things. For example, a constructor with up to 16 words could specify using a 16-bit bitfield which of its members are floats. So a constructor with floats would automatically use the expanded header. An array of records (with no floats) could have the record size specified in 16 bits. For 64-bit platforms, the expanded 16-bit tag could reside in the same header, with one bit extra used to specify the extra header word for any given tag. With 64 bits available, this extra header can be very powerful: it could be used to specify 64 words worth of floats for constructors, or it could specify a 5-bit record size for an array together with 32 bits to specify the floats in the record. Yotam On Tue, Sep 17, 2013 at 5:32 AM, Gerd Stolpmann <info@gerd-stolpmann.de>wrote: > Am Montag, den 16.09.2013, 11:26 -0400 schrieb Yotam Barnoy: > > Having looked through some of the ocaml runtime's code, I have a > > question regarding the Double_array block tag. Why not use a single > > tag for all block content that doesn't contain pointers instead? This > > would allow optimization of all cases where no pointers are involved, > > including float tuples, records with ints, bools and floats etc. > > > > The only use-case I've seen so far for Double_array tags is for > > polymorphic comparison ie. we need the type information to parse > > doubles correctly. However, the only default comparison that's valid > > on an array of anything is an equality comparison, which is easily > > doable without type information. Therefore, I'm confused as to why > > this is necessary. > > You also need the Double_array tags for normal array accesses on 32 bit > platforms: If you call a polymorphic function taking an array argument, > the function doesn't know whether it is called with a float array or a > normal array. Because of this, the compiler generates a dynamic check > whether the array is float or something else. For float, every element > is 64 bits wide, but for anything else it is 32 bits only. > > You are right that a special "no-scan" tag would speed up the GC marking > phase when there are large arrays profiting from it - for all other > no-scan cases except float. > > Gerd > > > > Thanks in advance for any answers > > > > Yotam Barnoy > > > > -- > ------------------------------------------------------------ > Gerd Stolpmann, Darmstadt, Germany gerd@gerd-stolpmann.de > My OCaml site: http://www.camlcity.org > Contact details: http://www.camlcity.org/contact.html > Company homepage: http://www.gerd-stolpmann.de > ------------------------------------------------------------ > [-- Attachment #2: Type: text/html, Size: 3870 bytes --] ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Caml-list] Expanding the Float Array tag 2013-09-18 15:10 ` Yotam Barnoy @ 2013-09-19 6:18 ` oleg 2013-09-19 9:47 ` Goswin von Brederlow 2013-09-19 10:10 ` Goswin von Brederlow 1 sibling, 1 reply; 14+ messages in thread From: oleg @ 2013-09-19 6:18 UTC (permalink / raw) To: yotambarnoy; +Cc: caml-list I am curious about the focus on generalizing double_tag or double_array_tag when there is already a seemingly good candidate for tagging values that should not be scanned. I mean the string_tag. Please search the OCaml code for string_tag and String_tag. You see it is mentioned in generic routines like generic comparison compare.c -- which uses memcpy anyway, thus treating the string just like a sequence of bytes. Other generic functions like hash are similar. The only other non-trivial use of string_tag is in printexc and the likes. Those cases are indeed problematic. Other than them, string_tag can denote arbitrary opaque byte array and noone will notice. Incidentally, if by string we mean UTF-8 or UTF-16 string, it is essentially a sequence of bytes anyway. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Caml-list] Expanding the Float Array tag 2013-09-19 6:18 ` oleg @ 2013-09-19 9:47 ` Goswin von Brederlow 0 siblings, 0 replies; 14+ messages in thread From: Goswin von Brederlow @ 2013-09-19 9:47 UTC (permalink / raw) To: caml-list On Thu, Sep 19, 2013 at 06:18:38AM -0000, oleg@okmij.org wrote: > > I am curious about the focus on generalizing double_tag or > double_array_tag when there is already a seemingly good candidate for > tagging values that should not be scanned. I mean the > string_tag. Please search the OCaml code for string_tag and > String_tag. You see it is mentioned in generic routines like > generic comparison compare.c -- which uses memcpy anyway, thus > treating the string just like a sequence of bytes. Other generic > functions like hash are similar. The only other non-trivial use of > string_tag is in printexc and the likes. Those cases are indeed > problematic. Other than them, string_tag can denote arbitrary opaque > byte array and noone will notice. Incidentally, if by string we mean > UTF-8 or UTF-16 string, it is essentially a sequence of bytes anyway. Strings are special in that they have a length in bytes. The length in words is taken from the header, multiplied by the word size and the delta to the real string length is stored in the last byte of the block. Reusing the string tag would confuse code that relies on this, like marshaling. MfG Goswin ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Caml-list] Expanding the Float Array tag 2013-09-18 15:10 ` Yotam Barnoy 2013-09-19 6:18 ` oleg @ 2013-09-19 10:10 ` Goswin von Brederlow 2013-09-20 2:18 ` Yotam Barnoy 1 sibling, 1 reply; 14+ messages in thread From: Goswin von Brederlow @ 2013-09-19 10:10 UTC (permalink / raw) To: caml-list On Wed, Sep 18, 2013 at 11:10:00AM -0400, Yotam Barnoy wrote: > So here is my tentative proposal: > > For 32-bit platforms, a specific tag will signify the extra header word. > This word will have 16 bits' worth of tag. I think 65000 tags are enough, > right? This isn't one of those "128KB will always be enough" kind of thing, > is it? The top 16 bits can be used for other things. For example, a > constructor with up to 16 words could specify using a 16-bit bitfield which > of its members are floats. So a constructor with floats would automatically > use the expanded header. An array of records (with no floats) could have > the record size specified in 16 bits. > > For 64-bit platforms, the expanded 16-bit tag could reside in the same > header, with one bit extra used to specify the extra header word for any > given tag. With 64 bits available, this extra header can be very powerful: > it could be used to specify 64 words worth of floats for constructors, or > it could specify a 5-bit record size for an array together with 32 bits to > specify the floats in the record. > > Yotam You have 2 issues here: 1) unboxed floats The double_array_tag saves a lot (50%) of ram because the floats are not individually boxed. For mixed blocks a bitfield could also allow unboxed floats. Inspecting the contents of such a block would be more complex then because, on 32bit, a is-float bit means the float is stored in 2 fields and the inspection has to combine the current field with the next and skip over the next field. You want to use a 16-bit bitfield to indicate which members are float. This would work for record and constructors with up to 16 members. I would modify this a bit. The lower 15 bit indicate which of the first 15 members are float while the 16th bit indicates if all remaining members are floats. If you declare the 16-bit value as signed and use arithmetic shift right to iterate through the bits you get this naturally. 2) values the GC doesn't need to scan This would probably be far more often usefull. There are tons of tuples and records that contain only primitive types (int, bool, unit, ...). This is also true for a lot of variant types. So a simple flat_tag would only cover half the cases. A flat bit would be better. On the other hand a bitfield for values the GC doesn't have to inspect seems pointless. Each value already has a bit for that. MfG Goswin ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Caml-list] Expanding the Float Array tag 2013-09-19 10:10 ` Goswin von Brederlow @ 2013-09-20 2:18 ` Yotam Barnoy 2013-09-20 6:25 ` Goswin von Brederlow 0 siblings, 1 reply; 14+ messages in thread From: Yotam Barnoy @ 2013-09-20 2:18 UTC (permalink / raw) To: Goswin von Brederlow; +Cc: Ocaml Mailing List [-- Attachment #1: Type: text/plain, Size: 2368 bytes --] > 1) unboxed floats > > The double_array_tag saves a lot (50%) of ram because the floats are > not individually boxed. For mixed blocks a bitfield could also allow > unboxed floats. > > Inspecting the contents of such a block would be more complex then > because, on 32bit, a is-float bit means the float is stored in 2 > fields and the inspection has to combine the current field with the > next and skip over the next field. > > You want to use a 16-bit bitfield to indicate which members are float. > This would work for record and constructors with up to 16 members. I > would modify this a bit. The lower 15 bit indicate which of the first > 15 members are float while the 16th bit indicates if all remaining > members are floats. If you declare the 16-bit value as signed and use > arithmetic shift right to iterate through the bits you get this > naturally. > > This sounds good. On 64 bit platforms, of course, you could use the second double word to indicate many more unboxed floats. And let's not forget that this impacts performance as well as memory usage. > 2) values the GC doesn't need to scan > > This would probably be far more often usefull. There are tons of > tuples and records that contain only primitive types (int, bool, unit, > ...). This is also true for a lot of variant types. So a simple > flat_tag would only cover half the cases. A flat bit would be better. > On the other hand a bitfield for values the GC doesn't have to inspect > seems pointless. Each value already has a bit for that. > Yeah. As soon as I start going down this path, I start thinking of haskell's heap solution, which is so elegant: a size field indicates how many pointers there are in the data structure, and all pointers are moved to the beginning of the structure. This allows them to have full size integers as well as unboxed floats. The cost is a less intuitive object layout. Of course, they don't have to worry about polymorphic compare or marshaling, since typeclasses cover all of these things in the form of attached dictionaries. Think of how much work they're saving the garbage collector though: they never have to read non-pointer values. If only haskell wasn't lazy :) ... BTW can anyone think of a downside to haskell's approach (other than the fact that ocaml needs to keep track of which values are floats for polymorphic reasons)? Yotam [-- Attachment #2: Type: text/html, Size: 2901 bytes --] ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Caml-list] Expanding the Float Array tag 2013-09-20 2:18 ` Yotam Barnoy @ 2013-09-20 6:25 ` Goswin von Brederlow 0 siblings, 0 replies; 14+ messages in thread From: Goswin von Brederlow @ 2013-09-20 6:25 UTC (permalink / raw) To: Yotam Barnoy; +Cc: Ocaml Mailing List On Thu, Sep 19, 2013 at 10:18:30PM -0400, Yotam Barnoy wrote: > > 1) unboxed floats > > > > The double_array_tag saves a lot (50%) of ram because the floats are > > not individually boxed. For mixed blocks a bitfield could also allow > > unboxed floats. > > > > Inspecting the contents of such a block would be more complex then > > because, on 32bit, a is-float bit means the float is stored in 2 > > fields and the inspection has to combine the current field with the > > next and skip over the next field. > > > > You want to use a 16-bit bitfield to indicate which members are float. > > This would work for record and constructors with up to 16 members. I > > would modify this a bit. The lower 15 bit indicate which of the first > > 15 members are float while the 16th bit indicates if all remaining > > members are floats. If you declare the 16-bit value as signed and use > > arithmetic shift right to iterate through the bits you get this > > naturally. > > > > This sounds good. On 64 bit platforms, of course, you could use the second > double word to indicate many more unboxed floats. And let's not forget that > this impacts performance as well as memory usage. > > > > > 2) values the GC doesn't need to scan > > > > This would probably be far more often usefull. There are tons of > > tuples and records that contain only primitive types (int, bool, unit, > > ...). This is also true for a lot of variant types. So a simple > > flat_tag would only cover half the cases. A flat bit would be better. > > On the other hand a bitfield for values the GC doesn't have to inspect > > seems pointless. Each value already has a bit for that. > > > > Yeah. As soon as I start going down this path, I start thinking of > haskell's heap solution, which is so elegant: a size field indicates how > many pointers there are in the data structure, and all pointers are moved > to the beginning of the structure. This allows them to have full size > integers as well as unboxed floats. The cost is a less intuitive object > layout. Of course, they don't have to worry about polymorphic compare or > marshaling, since typeclasses cover all of these things in the form of > attached dictionaries. > > Think of how much work they're saving the garbage collector though: they > never have to read non-pointer values. If only haskell wasn't lazy :) ... > BTW can anyone think of a downside to haskell's approach (other than the > fact that ocaml needs to keep track of which values are floats for > polymorphic reasons)? > > Yotam You could save the number of pointer, floats and other. But comparison needs to know the order in which fields are declared. If you sort pointers to the front then you need something to unsort them for every comparison, not just for polymorphic compare and floats. Also polymorphic types would have different order depending on the actual type: # let f (x, y) = x;; val f : 'a * 'b -> 'a = <fun> # f (1, (1, 1));; - : int = 1 # f ((1, 1), 1);; - : int * int = (1, 1) How would the compiler implement f? Since (1, 1) is a pointer it would be sorted to the front so both invocations of f would get the same data. You would have to include an index mapping for every field and access fields indirectly through that. That would cost memory as well as slow down every access. MfG Goswin ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2013-09-20 6:25 UTC | newest] Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2013-09-16 15:26 [Caml-list] Expanding the Float Array tag Yotam Barnoy 2013-09-16 16:29 ` Markus Mottl 2013-09-16 16:49 ` Yotam Barnoy 2013-09-16 17:14 ` Markus Mottl 2013-09-16 19:09 ` Yotam Barnoy 2013-09-17 0:31 ` Yotam Barnoy 2013-09-19 9:40 ` Goswin von Brederlow 2013-09-17 9:32 ` Gerd Stolpmann 2013-09-18 15:10 ` Yotam Barnoy 2013-09-19 6:18 ` oleg 2013-09-19 9:47 ` Goswin von Brederlow 2013-09-19 10:10 ` Goswin von Brederlow 2013-09-20 2:18 ` Yotam Barnoy 2013-09-20 6:25 ` Goswin von Brederlow
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox