From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by sympa.inria.fr (Postfix) with ESMTPS id D7B257EE4B for ; Tue, 8 Oct 2013 12:52:49 +0200 (CEST) Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender authenticity information available from domain of goswin-v-b@web.de) identity=pra; client-ip=212.227.17.11; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="goswin-v-b@web.de"; x-sender="goswin-v-b@web.de"; x-conformance=sidf_compatible Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender authenticity information available from domain of goswin-v-b@web.de) identity=mailfrom; client-ip=212.227.17.11; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="goswin-v-b@web.de"; x-sender="goswin-v-b@web.de"; x-conformance=sidf_compatible Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender authenticity information available from domain of postmaster@mout.web.de) identity=helo; client-ip=212.227.17.11; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="goswin-v-b@web.de"; x-sender="postmaster@mout.web.de"; x-conformance=sidf_compatible X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgYKACvjU1LU4xELZWdsb2JhbABZv3eFPIEfFg4LCwsJDy2CJQEBBTo/EAsYCSUPBSghE4dzAQMTsCAfKyKJVo9BB4MfgQQDmACGI48E X-IPAS-Result: AgYKACvjU1LU4xELZWdsb2JhbABZv3eFPIEfFg4LCwsJDy2CJQEBBTo/EAsYCSUPBSghE4dzAQMTsCAfKyKJVo9BB4MfgQQDmACGI48E X-IronPort-AV: E=Sophos;i="4.90,1056,1371074400"; d="scan'208";a="36043238" Received: from mout.web.de ([212.227.17.11]) by mail2-smtp-roc.national.inria.fr with ESMTP/TLS/DHE-RSA-AES128-SHA; 08 Oct 2013 12:52:48 +0200 Received: from frosties.localnet ([95.208.119.3]) by smtp.web.de (mrweb102) with ESMTPSA (Nemesis) id 0LbrUu-1WCQ322Q1E-00jEoW for ; Tue, 08 Oct 2013 12:52:48 +0200 Received: from mrvn by frosties.localnet with local (Exim 4.80) (envelope-from ) id 1VTUuE-0004AD-8G; Tue, 08 Oct 2013 12:52:46 +0200 Date: Tue, 8 Oct 2013 12:52:46 +0200 From: Goswin von Brederlow To: Yotam Barnoy Cc: Ocaml Mailing List Message-ID: <20131008105246.GA15550@frosties> References: <20130930144842.GE8693@frosties> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) X-Provags-ID: V03:K0:G74wcCYAEdh/4UL98zwAhkIrBbgHxkdURQYQEMCAnMFjqV8qfm+ ofOADTaSGCmGkkrksNtx2d/tl/TUhvlhFFop1MsalQxyHK4eliFYub1BkMLgoRFjDUBswYy bDd4bBiTI/csnRQQYJ334a1EEQE088vpOBxQ+64OsC3pc0qmMlcpaw+N1rN5/rKpAHEy7nh OYWnyYfDADtT57UDoZuMw== Subject: Re: [Caml-list] Proposal: re-design of ocaml headers On Mon, Sep 30, 2013 at 11:31:23AM -0400, Yotam Barnoy wrote: > On Mon, Sep 30, 2013 at 10:48 AM, Goswin von Brederlow wrote: > > > > > > > + For 16-bit and 32-bit architectures: > > > +---------------+----+----+-----+-------+------+ > > > | wosize | ext|cust|noptr| color | tag | > > > +---------------+----+----+-----+-------+------+ > > > bits 31 21 20 19 18 17 16 15 0 > > > > > > - noptr: no pointers present > > > - ext: uses extension word > > > - cust(om): uses custom word. Custom word is normally used to indicate > > > floats and pointers. > > > > > > 32 bit extension word (present only if ext is 1) > > > +---------------------------------------------+ > > > | wosize | > > > +---------------------------------------------+ > > > bits 31 0 > > > > Why use a full bit for ext? I would define wosize == 0 to mean an > > extension word with the actual size is present. That way sizes up to > > <16KB can be encoded without extension word. > > > > > Great point! Of course, that makes perfect sense. I was feeling like I was > wasting the wosize bits with the extension word but couldn't quite get put > 2 and 2 together. > BTW, down the thread is a newer version of the design that reduces the tag > space to 8000 tags, which I do think is sufficient. > > > > > > 32 bit custom word (default usage - present only if cust is 1): > > > +----+----------------------------------------+ > > > |nofp| pfbits | > > > +----+----------------------------------------+ > > > bits 31 30 0 > > > > > > - nofp: a structure with no floats. All pfbits are used for pointers, > > with > > > a 1 signifying a pointer and a 0 signifying a value. > > > - pfbits: indicates which double words are floats and pointers. Starting > > at > > > the highest bit: > > > - a 0 indicates neither a pointer nor a float > > > - a 10 indicates a float (double) > > > - a 11 indicates a pointer > > > - If noptr is set, each bit indicates a float. If nofp is set, each > > bit > > > indicates a pointer. > > > > There are 3 kinds of values: > > > > 1) pointers with bit 0 == 0 > > 2) non-pointers with bit 0 == 1 > > 3) floats with all bits used for the type (spanning 2 fields in 32bit) > > > > So if pfbits indicates a float then a field (or 2) is a float and all > > bits are used for the value. Otherwise the bit 0 of the field will > > tell you wether it is a pointer or not. So why would you want to > > duplicate that information in the pfbits? > > > > I was thinking of doing it for efficiency. If we're already indicating > what's what, we might as well represent shortcuts to the pointers, which > would cut down on the amount of reading, no? In the average case, the GC > would need to access a lot less memory. > > > > It might be nice to support C values like untagged ints or unaligned > > pointers. If Custom tag is set then the pfbits become ocaml value > > bits. The GC will only inspect fields with pfbit set. All other fields > > are ignored. The custom_operations handle compare, hash, serialize and > > deserialize so nothing else will access the data. > > > > Another thing are int32 and int64. I guess if you want to unbox those > > then having 2 bits per field in pfbits makes sense again. But then I > > would allocate them as: > > > > - a 00 indicates a tagged value (int or pointer) > > - a 01 indicates a non-pointer: int, int32, native int, C pointer > > - a 10 indicates a float (double) > > - a 11 indicates an int64 > > > > The higher bit would indicate a 64bit value, meaning spanning 2 fields > > on 32bit. Not that those 4 values allow mixing ocaml values, C values, > > int32, int64 and float in a block. > > > > I would combine the noptr and nofp bits into a single 2bit field: > > > > - a 00 indicates no pointers and no double size, no pfbits > > - a 01 indicates no double size, pfbits indicate tagged / non-pointer > > - a 10 indicates no pointers but double size, pfbits indicate size > > - a 11 indicates both pointers and double size, 2 pfbits per field > > > > Note: tagged integers can be stored as 00 or 01. I think this would be > > required for polymorphic types. An 'a could be int or pointer. In both > > cases 00 will work. > > > > > I really like this idea -- unboxing more types could be really useful. I'm > not sure double 'size' would work, however. It should be fine for the > marshal module, but polymorphic comparison would get messed up because > floats have to be compared differently. So I think 10 in the bit field > should indicate no pointers but floats, while 11 could allow both pointers > and double size, with the 2-bits specifying if it's a float or an int64 (as > you've outlined). Of course, one cannot have both shortcuts to pointers and > enhanced unboxing, so let me know what you think about the performance > increase from shortcutting the tag bit. > > Yotam Lets look at an example: type 'a r = { a:int; b:float; c:int32; d:int64; e:'a; } For 16-bit and 32-bit architectures: +--------------------+----------+-------+------+ | wosize |pfbit type| color | tag | +--------------------+----------+-------+------+ bits 31 20 19 18 17 16 15 0 wosize = 7 pfbit type = 11 (pointers and double size) +------------------------------+--+--+--+--+--+ | pfbits |00|11|01|10|01| +------------------------------+--+--+--+--+--+ e d c b a The GC only needs to check e since 'a might be a pointer. All other fields are marked as non pointer. Comparison does a plain bit comparison on a, c and d, a float comparison on b and a tagged comparison on e. Similar for marshaling. There is no confusion between int64 and floats. MfG Goswin