From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <yotambarnoy@gmail.com>
Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104])
	by sympa.inria.fr (Postfix) with ESMTPS id 477907EE4B
	for <caml-list@sympa.inria.fr>; Fri, 11 Oct 2013 17:49:15 +0200 (CEST)
Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender
  authenticity information available from domain of
  yotambarnoy@gmail.com) identity=pra; client-ip=209.85.128.45;
  receiver=mail3-smtp-sop.national.inria.fr;
  envelope-from="yotambarnoy@gmail.com";
  x-sender="yotambarnoy@gmail.com";
  x-conformance=sidf_compatible
Received-SPF: Pass (mail3-smtp-sop.national.inria.fr: domain of
  yotambarnoy@gmail.com designates 209.85.128.45 as permitted
  sender) identity=mailfrom; client-ip=209.85.128.45;
  receiver=mail3-smtp-sop.national.inria.fr;
  envelope-from="yotambarnoy@gmail.com";
  x-sender="yotambarnoy@gmail.com";
  x-conformance=sidf_compatible; x-record-type="v=spf1"
Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender
  authenticity information available from domain of
  postmaster@mail-qe0-f45.google.com) identity=helo;
  client-ip=209.85.128.45;
  receiver=mail3-smtp-sop.national.inria.fr;
  envelope-from="yotambarnoy@gmail.com";
  x-sender="postmaster@mail-qe0-f45.google.com";
  x-conformance=sidf_compatible
Received: from mail-qe0-f45.google.com ([209.85.128.45])
  by mail3-smtp-sop.national.inria.fr with ESMTP/TLS/RC4-SHA; 11 Oct 2013 17:49:13 +0200
Received: by mail-qe0-f45.google.com with SMTP id 8so3352532qea.4
        for <caml-list@inria.fr>; Fri, 11 Oct 2013 08:49:12 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20120113;
        h=mime-version:in-reply-to:references:from:date:message-id:subject:to
         :cc:content-type;
        bh=yzQkWiglDZfjE4+wCjmePnTQuc+cayNJ5Iczf84D3Kc=;
        b=idgJhglrox8YBaqpmKLixOBjnjVfLVfqivUoIexJuJvXheToGCSezGPKZnX/pyME2X
         EoVsK77CkXNc0NwG6FaSf/jWHVKIYmKzYroceUmatqJWA2sBkcAxg/ZY9sP+DPkoAOhl
         UsBenxgF+zKt5jQ08bra0UBw4qE+Rg7Tt9RzNW/Ynsx5BboK2BHuiVecD43WRx7ZBC8k
         N3ngdbbJVi0dYkOBhaeYBzrtY+6XQmGIrl5wFHOwH9I6Q1bLf9ifn83Uk+YsZW+YPKVs
         9vbKhkdj/7NJyfgQO2CIdGz8TY7Wng9Ou/RwSCxug+iyRoRbsLdtQ9XAnO6/nV8cj1ZY
         VQXg==
X-Received: by 10.49.47.84 with SMTP id b20mr8626392qen.83.1381506552614; Fri,
 11 Oct 2013 08:49:12 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.224.139.20 with HTTP; Fri, 11 Oct 2013 08:48:52 -0700 (PDT)
In-Reply-To: <20131008105246.GA15550@frosties>
References: <CAN6ygO=cnhc039DEOVf7uZqpTCedVO0SMnG+rvFw4hm4qPc7cg@mail.gmail.com>
 <20130930144842.GE8693@frosties> <CAN6ygOnmk_EGViZR_tmHuz+cjmevQiyeS9XeHUpCWDcGhkwFMg@mail.gmail.com>
 <20131008105246.GA15550@frosties>
From: Yotam Barnoy <yotambarnoy@gmail.com>
Date: Fri, 11 Oct 2013 11:48:52 -0400
Message-ID: <CAN6ygOk4OLJN+Cvsips6nic16d7ZTomwceWPXX91O-S0k5xzKA@mail.gmail.com>
To: Goswin von Brederlow <goswin-v-b@web.de>
Cc: Ocaml Mailing List <caml-list@inria.fr>
Content-Type: multipart/alternative; boundary=047d7b33d37452d6ad04e8790f0f
Subject: Re: [Caml-list] Proposal: re-design of ocaml headers


--047d7b33d37452d6ad04e8790f0f
Content-Type: text/plain; charset=ISO-8859-1

I had an idea based on (or actually copied from) something Goswin mentioned
earlier in our discussion. What if the pfbits could also be used to indicate
embedded values? An embedded value would have its header inside the body of
the parent value, making it possible to get rid of all pointers within that
parent. This could be a large saving for the GC, and it would also reduce
random jumps around memory, which are very hard on the cache.

In the spirit of this idea, here is my latest version:

+ For 16-bit and 32-bit architectures:
     +---------------+-------------+-----+-------+------+
     |       wosize  |pfbits type  |noptr| color | tag  |
     +---------------+-------------+-----+-------+------+
bits  31           19  18   17   16   15  14   13 12   0

pfbits type:
- 000: no pfbits
- 001: pfbits indicate embedded header
- 010: pfbits indicate float
- 011: pfbits indicate untagged type
- 100: pfbits indicate int64
- 101: pfbits indicate float/embedded header/untagged type, 2 bits per field
- 110: pfbits indicate float/untagged type/int64, 2 bits per field
- 111: pfbits indicate embedded header/untagged type/int64, 2 bits per field

- noptr: no pointers present
- if wosize = 0, the wosize_large extension word holds the actual wosize
- if wosize = 0 and a pfbits word is also present, wosize_large comes first
in memory
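
As a rough sketch of how the 32-bit header above could be decoded (the
record and function names are made up for illustration, and none of this
is existing runtime code):

```ocaml
(* Sketch only: field names are illustrative, not runtime definitions. *)
type header32 = {
  wosize : int;       (* bits 31..19; 0 means the size is in wosize_large *)
  pfbits_type : int;  (* bits 18..16 *)
  noptr : bool;       (* bit 15 *)
  color : int;        (* bits 14..13 *)
  tag : int;          (* bits 12..0 *)
}

(* Extract [width] bits of [hd] starting at bit [lo]. *)
let field32 (hd : int32) lo width =
  Int32.to_int
    (Int32.logand
       (Int32.shift_right_logical hd lo)
       (Int32.of_int ((1 lsl width) - 1)))

let decode_header32 hd = {
  wosize = field32 hd 19 13;
  pfbits_type = field32 hd 16 3;
  noptr = field32 hd 15 1 = 1;
  color = field32 hd 13 2;
  tag = field32 hd 0 13;
}

(* Number of extra words that follow the header: wosize_large comes first
   when wosize = 0, then the pfbits word unless the pfbits type is 000. *)
let extra_words h =
  (if h.wosize = 0 then 1 else 0) + (if h.pfbits_type <> 0 then 1 else 0)
```

With this reading, a small block with no pfbits costs one header word, and
the worst case (large block with pfbits) costs three.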

wosize_large word (if wosize is 0 in the header)
     +---------------------------------------------+
     |                   wosize                    |
     +---------------------------------------------+
bits  31                                          0

32 bit pfbits word (present only if called for by pfbits type in header)
     +---------------------------------------------+
     |                   pfbits                    |
     +---------------------------------------------+
bits  31                                          0

- pfbits:
    - In 1-bit mode, each bit marks its field as whatever the pfbits type
field signals.
    - In 2-bit mode, 00 always represents a regular word, and 01, 10, 11
represent the types signaled in the pfbits type field.
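
To make the two modes concrete, here is a small sketch of how a pfbits word
could be interpreted on the 32-bit layout. Two things are assumptions of
mine and not fixed by the proposal: entry 0 sits in the lowest bits of the
word, and the 2-bit codes 01, 10, 11 map onto the listed types in the order
they are written. The type and function names are invented for the example,
and the pfbits word is held in a plain int (so a 64-bit host is assumed).

```ocaml
type kind = K_regular | K_embedded | K_float | K_untagged | K_int64

type mode =
  | No_pfbits
  | One_bit of kind                 (* one bit per entry *)
  | Two_bit of kind * kind * kind   (* meaning of codes 01, 10, 11 *)

(* pfbits type field of the 32-bit header -> interpretation of the pfbits. *)
let mode_of_type = function
  | 0b001 -> One_bit K_embedded
  | 0b010 -> One_bit K_float
  | 0b011 -> One_bit K_untagged
  | 0b100 -> One_bit K_int64
  | 0b101 -> Two_bit (K_float, K_embedded, K_untagged)
  | 0b110 -> Two_bit (K_float, K_untagged, K_int64)
  | 0b111 -> Two_bit (K_embedded, K_untagged, K_int64)
  | _ -> No_pfbits

(* Kind of the [i]-th entry described by the pfbits word. *)
let entry_kind ~pfbits_type ~pfbits i =
  match mode_of_type pfbits_type with
  | No_pfbits -> K_regular
  | One_bit k -> if (pfbits lsr i) land 1 = 1 then k else K_regular
  | Two_bit (k1, k2, k3) ->
    (match (pfbits lsr (2 * i)) land 3 with
     | 0 -> K_regular
     | 1 -> k1
     | 2 -> k2
     | _ -> k3)
```

In 1-bit mode a single word covers 32 entries, in 2-bit mode only 16.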


+ For 64-bit architectures:

     +-------------+--------+----------+---+------+-------+------+
     |     pfbits  | wosize |pfbit type|exp| noptr| color | tag  |
     +-------------+--------+----------+---+------+-------+------+
bits  63         40 39    20 19      17 16    15   14   13 12   0


- noptr: a structure with no pointers.
- pfbits: a small pfbits field for smaller objects
- pfbits type: (slightly different from the 32-bit layout)
    - 000: no pfbits, wosize includes pfbits as its upper bits
    - 001: pfbits indicate embedded header
    - 010: pfbits indicate float
    - 011: pfbits indicate untagged type
    - 100: pfbits indicate int64
    - 101: pfbits indicate float/untagged type/embedded header, 2 bits per field
    - 110: pfbits indicate float/untagged type/int64, 2 bits per field
    - 111: pfbits indicate int64/untagged type/embedded header, 2 bits per field
- exp: use pfbits_expanded for signaling pfbits. The pfbits field in the
header becomes the top bits of wosize.


     +--------------------------------------------------------+
     |                pfbits_expanded                         |
     +--------------------------------------------------------+
bits  63                                                     0

- pfbits_expanded: if exp is set, pfbits_expanded takes the place of the
pfbits, and the pfbits field in the header is joined with wosize.
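
The 64-bit rules can be sketched the same way. This is only my reading of
the layout above: when the pfbits type is 000 or exp is set, bits 63..40
are treated as the upper bits of wosize (44 bits in total); otherwise they
are the inline pfbits. All names are hypothetical, and the result is held
in a native int, so a 64-bit host is assumed.

```ocaml
type header64 = {
  wosize : int;                (* 20 bits, or 44 bits when pfbits are folded in *)
  inline_pfbits : int option;  (* Some _ only when the header carries pfbits *)
  pfbits_type : int;           (* bits 19..17 *)
  expanded : bool;             (* exp, bit 16 *)
  noptr : bool;                (* bit 15 *)
  color : int;                 (* bits 14..13 *)
  tag : int;                   (* bits 12..0 *)
}

(* Extract [width] bits of [hd] starting at bit [lo]. *)
let field64 (hd : int64) lo width =
  Int64.to_int
    (Int64.logand
       (Int64.shift_right_logical hd lo)
       (Int64.pred (Int64.shift_left 1L width)))

let decode_header64 hd =
  let pfbits_type = field64 hd 17 3 in
  let expanded = field64 hd 16 1 = 1 in
  (* With no pfbits, or with pfbits moved to pfbits_expanded, the header's
     pfbits field becomes the top bits of wosize. *)
  let folded = pfbits_type = 0 || expanded in
  { wosize = (if folded then field64 hd 20 44 else field64 hd 20 20);
    inline_pfbits = (if folded then None else Some (field64 hd 40 24));
    pfbits_type;
    expanded;
    noptr = field64 hd 15 1 = 1;
    color = field64 hd 13 2;
    tag = field64 hd 0 13 }
```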

+ Tags:

- 0: Array, record, tuple tag
- 1: Infix tag (must be 1 mod 4)
- 2: Closure tag
- 3: Lazy tag
- 4: Object tag
- 5: Forward tag
- 6: Abstract tag
- 7: String tag
- 8: Double tag
- 9: Custom tag
- 10: Double_array tag
- 11: Proposed: Int32_array tag
- 12: Proposed: Int64_array tag
- 13: Proposed: Cptr_array tag
- 14: Proposed: Float32_array tag
- 1000: first user tag

-Yotam


On Tue, Oct 8, 2013 at 6:52 AM, Goswin von Brederlow <goswin-v-b@web.de> wrote:

> On Mon, Sep 30, 2013 at 11:31:23AM -0400, Yotam Barnoy wrote:
> > On Mon, Sep 30, 2013 at 10:48 AM, Goswin von Brederlow <goswin-v-b@web.de> wrote:
> >
> > > >
> > > > + For 16-bit and 32-bit architectures:
> > > >      +---------------+----+----+-----+-------+------+
> > > >      |     wosize    | ext|cust|noptr| color | tag  |
> > > >      +---------------+----+----+-----+-------+------+
> > > > bits  31           21  20   19   18   17   16 15   0
> > > >
> > > > - noptr: no pointers present
> > > > - ext:  uses extension word
> > > > - cust(om): uses custom word. Custom word is normally used to
> > > > indicate floats and pointers.
> > > >
> > > > 32 bit extension word (present only if ext is 1)
> > > >      +---------------------------------------------+
> > > >      |                   wosize                    |
> > > >      +---------------------------------------------+
> > > > bits  31                                          0
> > >
> > > Why use a full bit for ext? I would define wosize == 0 to mean an
> > > extension word with the actual size is present. That way sizes up to
> > > <16KB can be encoded without extension word.
> > >
> > >
> > Great point! Of course, that makes perfect sense. I was feeling like I
> > was wasting the wosize bits with the extension word but couldn't quite
> > put 2 and 2 together.
> > BTW, down the thread is a newer version of the design that reduces the
> > tag space to 8000 tags, which I do think is sufficient.
> >
> >
> >
> > > > 32 bit custom word (default usage - present only if cust is 1):
> > > >      +----+----------------------------------------+
> > > >      |nofp|              pfbits                    |
> > > >      +----+----------------------------------------+
> > > > bits   31  30                                     0
> > > >
> > > > - nofp: a structure with no floats. All pfbits are used for
> > > > pointers, with a 1 signifying a pointer and a 0 signifying a value.
> > > > - pfbits: indicates which double words are floats and pointers.
> > > > Starting at the highest bit:
> > > >     - a 0 indicates neither a pointer nor a float
> > > >     - a 10 indicates a float (double)
> > > >     - a 11 indicates a pointer
> > > >     - If noptr is set, each bit indicates a float. If nofp is set,
> > > > each bit indicates a pointer.
> > >
> > > There are 3 kinds of values:
> > >
> > > 1) pointers with bit 0 == 0
> > > 2) non-pointers with bit 0 == 1
> > > 3) floats with all bits used for the type (spanning 2 fields in 32bit)
> > >
> > > So if pfbits indicates a float then a field (or 2) is a float and all
> > > bits are used for the value. Otherwise the bit 0 of the field will
> > > tell you whether it is a pointer or not. So why would you want to
> > > duplicate that information in the pfbits?
> > >
> >
> > I was thinking of doing it for efficiency. If we're already indicating
> > what's what, we might as well represent shortcuts to the pointers, which
> > would cut down on the amount of reading, no? In the average case, the GC
> > would need to access a lot less memory.
> >
> >
> > > It might be nice to support C values like untagged ints or unaligned
> > > pointers. If Custom tag is set then the pfbits become ocaml value
> > > bits. The GC will only inspect fields with pfbit set. All other fields
> > > are ignored. The custom_operations handle compare, hash, serialize and
> > > deserialize so nothing else will access the data.
> > >
> > > Another thing are int32 and int64. I guess if you want to unbox those
> > > then having 2 bits per field in pfbits makes sense again. But then I
> > > would allocate them as:
> > >
> > >     - a 00 indicates a tagged value (int or pointer)
> > >     - a 01 indicates a non-pointer: int, int32, native int, C pointer
> > >     - a 10 indicates a float (double)
> > >     - a 11 indicates an int64
> > >
> > > The higher bit would indicate a 64bit value, meaning spanning 2 fields
> > > on 32bit. Note that those 4 values allow mixing ocaml values, C values,
> > > int32, int64 and float in a block.
> > >
> > > I would combine the noptr and nofp bits into a single 2bit field:
> > >
> > >     - a 00 indicates no pointers and no double size, no pfbits
> > >     - a 01 indicates no double size, pfbits indicate tagged / non-pointer
> > >     - a 10 indicates no pointers but double size, pfbits indicate size
> > >     - a 11 indicates both pointers and double size, 2 pfbits per field
> > >
> > > Note: tagged integers can be stored as 00 or 01. I think this would be
> > > required for polymorphic types. An 'a could be int or pointer. In both
> > > cases 00 will work.
> > >
> > >
> > I really like this idea -- unboxing more types could be really useful.
> > I'm not sure double 'size' would work, however. It should be fine for
> > the marshal module, but polymorphic comparison would get messed up
> > because floats have to be compared differently. So I think 10 in the
> > bit field should indicate no pointers but floats, while 11 could allow
> > both pointers and double size, with the 2 bits specifying if it's a
> > float or an int64 (as you've outlined). Of course, one cannot have both
> > shortcuts to pointers and enhanced unboxing, so let me know what you
> > think about the performance increase from shortcutting the tag bit.
> >
> > Yotam
>
> Let's look at an example:
>
> type 'a r = { a:int; b:float; c:int32; d:int64; e:'a; }
>
> For 16-bit and 32-bit architectures:
>      +--------------------+----------+-------+------+
>      |     wosize         |pfbit type| color | tag  |
>      +--------------------+----------+-------+------+
> bits  31               20   19   18   17   16 15   0
>
> wosize = 7
> pfbit type = 11 (pointers and double size)
>
>      +------------------------------+--+--+--+--+--+
>      |                   pfbits     |00|11|01|10|01|
>      +------------------------------+--+--+--+--+--+
>                                       e  d  c  b  a
>
> The GC only needs to check e since 'a might be a pointer. All other fields
> are marked as non pointer.
>
> Comparison does a plain bit comparison on a, c and d, a float
> comparison on b and a tagged comparison on e. Similar for marshaling.
> There is no confusion between int64 and floats.
>
> MfG
>         Goswin
>

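Goswin's example above can be checked with a few lines of OCaml. This is
just a sketch of his 2-bit scheme (00 tagged, 01 non-pointer, 10 float,
11 int64) with the first field in the lowest bits, as in his diagram; the
type and function names are invented for the example.

```ocaml
type field = Tagged | Non_pointer | Double | Int64_field

(* 2-bit code per field, following the quoted scheme. *)
let code = function
  | Tagged -> 0b00
  | Non_pointer -> 0b01
  | Double -> 0b10
  | Int64_field -> 0b11

(* The higher bit of the code marks a 64-bit value, i.e. two words on 32-bit. *)
let words32 f = if code f land 0b10 <> 0 then 2 else 1

(* Pack the codes with the first field in the lowest two bits. *)
let pfbits fields =
  List.fold_right (fun f acc -> (acc lsl 2) lor code f) fields 0

let wosize32 fields = List.fold_left (fun n f -> n + words32 f) 0 fields

(* type 'a r = { a:int; b:float; c:int32; d:int64; e:'a; } *)
let r = [ Non_pointer; Double; Non_pointer; Int64_field; Tagged ]
```

Here wosize32 r gives the wosize = 7 from the example, and pfbits r gives
the bit pattern 00 11 01 10 01 shown in the diagram.
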
--047d7b33d37452d6ad04e8790f0f--