From: Chet Murthy <murthy.chet@gmail.com>
To: Xavier Leroy <Xavier.Leroy@inria.fr>
Cc: Frederic Perriot <fperriot@gmail.com>, caml-list <caml-list@inria.fr>
Subject: Re: [Caml-list] an implicit GC rule?
Date: Sun, 6 May 2018 12:23:41 -0700
Message-ID: <CA++P_gcLtW+CFpmxnOS79OSqen5nSNKJ-ueRrZm4oezaCEzf2Q@mail.gmail.com>
In-Reply-To: <CAH=h3gFdsCaNDOnF2oJFAYbOMEZihS-A7tMO5EiAnTaH0QwUjw@mail.gmail.com>
Oh shit, and also, uh, oh right! I forgot that, for example, in
Tree((v1 = e1), (v2 = e2)) it could transpire that after evaluating e1 and
assigning to v1, the evaluation of e2 could end up moving the value pointed
to by v1, updating v1, but NOT updating the result of the expression
(v1 = e1) (b/c of course, it's already evaluated and on the stack, waiting
for the call to Tree()).
Oof.
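To make that concrete, here is a minimal sketch in plain C, with no OCaml runtime involved. The names `root`, `fake_gc_move` and `snapshot_goes_stale` are made up for illustration; `fake_gc_move` only stands in for a collection that relocates a block and patches the registered variable. The two steps are written as separate statements so the outcome does not depend on C's unspecified argument evaluation order.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a local registered with CAMLlocal:
   the "collector" below knows about it and may rewrite it. */
static intptr_t root;

/* Hypothetical stand-in for a GC that moves the block: it patches the
   registered variable, and nothing else. */
static void fake_gc_move(intptr_t new_address)
{
    root = new_address;
}

/* Returns 1 if the saved result of (root = ...) no longer matches root. */
static int snapshot_goes_stale(void)
{
    /* In C, the value of an assignment expression is a copy of what was
       stored, not a reference to the variable. */
    intptr_t snapshot = (root = 0x1000);

    fake_gc_move(0x2000);       /* e.g. triggered while evaluating e2 */

    assert(root == 0x2000);     /* the registered variable was updated */
    return snapshot != root;    /* the saved expression value was not */
}
```

Calling `snapshot_goes_stale()` reports that the copy and the variable have diverged, which is the same situation as the already-evaluated (v1 = e1) argument sitting on the stack.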
On Sat, May 5, 2018 at 12:42 AM, Xavier Leroy <Xavier.Leroy@inria.fr> wrote:
>
>
> On Sat, May 5, 2018 at 5:25 AM Chet Murthy <murthy.chet@gmail.com> wrote:
>
>> It's been a while since I did this sort of thing, but I suspect if you
>> declare CAMLlocal variables for each intermediate expression, and stick in
>> the assignments, that should solve your problem (while not making your code
>> too ugly). E.g.
>>
>> CAMLprim value left_comb(value a, value b, value c)
>> {
>>   CAMLparam3(a, b, c);
>>   CAMLlocal5(l1, l2, l3, l4, l5);
>>   CAMLreturn(l1 = Tree((l2 = Tree((l3 = Leaf(a)), (l4 = Leaf(b)))),
>>                        (l5 = Leaf(c))));
>> }
>>
>
> That's bold C/C++ programming! It might even work in C++, where
> assignment expressions are l-values if I remember correctly.
>
> However, I'm afraid it won't work in C because an assignment expression
> "lv = rv" is a r-value equal to the value of rv converted to the type of lv
> at the time the assignment is evaluated. So, if lv is a local variable
> registered with the GC, the GC will update lv when needed, but the "lv =
> rv" expression will keep its initial value.
>
> There are also the rules concerning sequence points. I think the code above
> respects the C99 rules but I'm less sure about the C11 rules.
>
>>
>> Even better, you could linearize the tree of expressions into a sequence,
>> and that should solve your problem, also.
>>
>
> Yes, that's the robust solution. Spelling it out:
>
> CAMLprim value left_comb(value a, value b, value c)
> {
>   CAMLparam3(a, b, c);
>   CAMLlocal5(la, lb, lc, tab, t);
>   la = Leaf(a);
>   lb = Leaf(b);
>   lc = Leaf(c);
>   tab = Tree(la, lb);
>   t = Tree(tab, lc);
>   CAMLreturn(t);
> }
>
> You can also do "CAMLreturn(Tree(tab, lc))" directly.
>
> - Xavier Leroy
>
>
>> Uh, I think. Been a while since I wrote a lotta C/C++ code to interface
>> with OCaml, but this oughta work.
>>
>> --chet--
>>
>>
>> On Wed, May 2, 2018 at 9:09 AM, Frederic Perriot <fperriot@gmail.com>
>> wrote:
>>
>>> Hello caml-list,
>>>
>>> I have a GC-related question. To give you some context, I'm writing a
>>> tool to parse .cmi files and generate .h and .c files, to facilitate
>>> constructing OCaml variants from C bindings.
>>>
>>> For instance, given the following source:
>>>
>>> type 'a tree = Leaf of 'a | Tree of 'a tree * 'a tree [@@h_file]
>>>
>>>
>>> the tool produces C functions:
>>>
>>> CAMLprim value Leaf(value arg1)
>>> {
>>>   CAMLparam1(arg1);
>>>   CAMLlocal1(obj);
>>>
>>>   obj = caml_alloc_small(1, 0);
>>>   Field(obj, 0) = arg1;
>>>
>>>   CAMLreturn(obj);
>>> }
>>>
>>> CAMLprim value Tree(value arg1, value arg2)
>>> {
>>>   // similar code here
>>> }
>>>
>>>
>>> From there, it's tempting to nest calls to variant constructors from C
>>> and write code such as:
>>>
>>> CAMLprim value left_comb(value a, value b, value c)
>>> {
>>>   CAMLparam3(a, b, c);
>>>   CAMLreturn(Tree(Tree(Leaf(a), Leaf(b)), Leaf(c)));
>>> }
>>>
>>>
>>> The problem with the above is the GC root loss due to the nesting of
>>> calls to allocating functions.
>>>
>>> Say Leaf(c) is constructed first, and the resulting value cached in a
>>> register, then Leaf(b) triggers a collection, thus invalidating the
>>> register contents, and leaving a dangling pointer in the top Tree.
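The root loss described here can be sketched with a toy model in plain C. This is only an illustration under loud assumptions: `toy_value`, `toy_register_root`, `toy_collect`, `toy_alloc` and `toy_root_loss` are all hypothetical names, and the "collector" merely shifts pretend addresses by 1000. It is not how the OCaml runtime is implemented; it only shows that a registered slot gets patched while an unregistered copy does not.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ROOTS 8

typedef long toy_value;               /* pretend address of a heap block */

static toy_value *roots[MAX_ROOTS];   /* addresses of registered locals */
static size_t n_roots;
static toy_value next_free = 100;

static void toy_register_root(toy_value *slot)
{
    assert(n_roots < MAX_ROOTS);
    roots[n_roots++] = slot;
}

/* Pretend collection: every block moves by 1000, and only the
   registered slots are patched to follow it. */
static void toy_collect(void)
{
    for (size_t i = 0; i < n_roots; i++)
        *roots[i] += 1000;
    next_free += 1000;
}

/* Pretend allocation that always triggers a collection first. */
static toy_value toy_alloc(void)
{
    toy_collect();
    toy_value v = next_free;
    next_free += 10;
    return v;
}

/* Returns 1 when the unregistered temporary is left dangling. */
static int toy_root_loss(void)
{
    toy_value registered = toy_alloc();  /* like a CAMLlocal variable */
    toy_register_root(&registered);
    toy_value temporary = registered;    /* like a result cached in rbx */

    (void)toy_alloc();                   /* like Leaf(b) triggering a GC */

    n_roots--;                           /* unregister before returning */
    return registered == temporary + 1000;
}
```

After the second allocation, `registered` has followed the block to its new pretend address, while `temporary` still holds the old one, which is the position the cached Leaf(c) result is in.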
>>>
>>> Here is the actual ocamlopt output, with Leaf(c) getting cached in rbx:
>>>
>>> 0x000000000040dbf4 <+149>: callq 0x40d8fd <Leaf>
>>> 0x000000000040dbf9 <+154>: mov %rax,%rbx
>>> 0x000000000040dbfc <+157>: mov -0x90(%rbp),%rax
>>> 0x000000000040dc03 <+164>: mov %rax,%rdi
>>> 0x000000000040dc06 <+167>: callq 0x40d8fd <Leaf>
>>> 0x000000000040dc0b <+172>: mov %rax,%r12
>>> 0x000000000040dc0e <+175>: mov -0x88(%rbp),%rax
>>> 0x000000000040dc15 <+182>: mov %rax,%rdi
>>> 0x000000000040dc18 <+185>: callq 0x40d8fd <Leaf>
>>> 0x000000000040dc1d <+190>: mov %r12,%rsi
>>> 0x000000000040dc20 <+193>: mov %rax,%rdi
>>> 0x000000000040dc23 <+196>: callq 0x40da19 <Tree>
>>> 0x000000000040dc28 <+201>: mov %rbx,%rsi
>>> 0x000000000040dc2b <+204>: mov %rax,%rdi
>>> 0x000000000040dc2e <+207>: callq 0x40da19 <Tree>
>>>
>>>
>>> While the C code clearly violates the spirit of the GC rules, I can't
>>> help but feel this is still a pitfall.
>>>
>>> Rule 2 of the manual states: "Local variables of type value must be
>>> declared with one of the CAMLlocal macros. [...]"
>>>
>>> But here, I'm not declaring local variables, unless you count compiler
>>> temporaries as local variables?
>>>
>>> I can see other people making the same mistake I did. Should there be
>>> an explicit warning in the rules? Maybe one underlining that compiler
>>> temps count as variables, or one discouraging the kind of nested calls
>>> returning values shown above?
>>>
>>> thanks,
>>> Frédéric Perriot
>>>
>>> PS: this is also my first time posting to the list, so I take this
>>> opportunity to thank you for the great Q's and A's I've read here over
>>> the years
>>>
>>> --
>>> Caml-list mailing list. Subscription management and archives:
>>> https://sympa.inria.fr/sympa/arc/caml-list
>>> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
>>> Bug reports: http://caml.inria.fr/bin/caml-bugs
>>
>>
>>