From: Xavier Leroy <Xavier.Leroy@inria.fr>
To: Chet Murthy <murthy.chet@gmail.com>
Cc: Frederic Perriot <fperriot@gmail.com>, caml-list <caml-list@inria.fr>
Subject: Re: [Caml-list] an implicit GC rule?
Date: Sat, 05 May 2018 07:42:03 +0000
Message-ID: <CAH=h3gFdsCaNDOnF2oJFAYbOMEZihS-A7tMO5EiAnTaH0QwUjw@mail.gmail.com>
In-Reply-To: <CA++P_gcfkvcW33MOQtbU_yq_68F0miGhxgEdEW_ErStVSvdMvQ@mail.gmail.com>
On Sat, May 5, 2018 at 5:25 AM Chet Murthy <murthy.chet@gmail.com> wrote:
> It's been a while since I did this sort of thing, but I suspect if you
> declare CAMLlocal variables for each intermediate expression, and stick in
> the assignments, that should solve your problem (while not making your code
> too ugly). E.g.
>
> CAMLprim value left_comb(value a, value b, value c)
> {
>   CAMLparam3(a, b, c);
>   CAMLlocal5(l1, l2, l3, l4, l5);
>   CAMLreturn(l1 = Tree((l2 = Tree((l3 = Leaf(a)), (l4 = Leaf(b)))),
>                        (l5 = Leaf(c))));
> }
>
That's bold C/C++ programming! It might even work in C++, where assignment
expressions are l-values, if I remember correctly.

However, I'm afraid it won't work in C, because an assignment expression
"lv = rv" is an r-value equal to the value of rv converted to the type of lv
at the time the assignment is evaluated. So, if lv is a local variable
registered with the GC, the GC will update lv when needed, but the value of
the "lv = rv" expression will keep its initial value.

There are also the rules concerning sequence points. I think the code above
respects the C99 rules, but I'm less sure about the C11 rules.
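
To make the r-value point concrete, here is a minimal sketch in plain C. It is a toy model under stated assumptions, not the OCaml runtime: the names toy_register_root and toy_collect are hypothetical stand-ins for root registration and a moving collection, and "relocation" is simulated by adding 1000 to every registered variable.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: none of these names belong to the OCaml runtime API. */
typedef intptr_t value;              /* stand-in for OCaml's `value` type */

#define MAX_ROOTS 8
static value *toy_roots[MAX_ROOTS];  /* addresses of registered locals */
static int toy_root_count = 0;

/* Hypothetical analogue of CAMLlocal: remember where the variable lives. */
void toy_register_root(value *slot)
{
    assert(toy_root_count < MAX_ROOTS);
    toy_roots[toy_root_count++] = slot;
}

/* Hypothetical "moving collection": rewrite every registered root in place,
   as if the object it designates had been moved by 1000. */
void toy_collect(void)
{
    for (int i = 0; i < toy_root_count; i++)
        *toy_roots[i] += 1000;
}

/* Returns 1 if the value of the assignment expression went stale. */
int assignment_copy_goes_stale(void)
{
    value lv = 0;
    toy_register_root(&lv);

    /* In C, the expression `lv = 100` yields a plain copy of the stored
       value (100); it is not a reference to lv. */
    value expr_value = (lv = 100);

    toy_collect();                   /* lv is updated in place to 1100 */

    assert(lv == 1100);              /* the registered variable was fixed up */
    assert(expr_value == 100);       /* the copy still holds the old "address" */

    toy_root_count--;                /* unregister lv (the last root added) */
    return expr_value != lv;
}
```

Calling assignment_copy_goes_stale() returns 1 in this model: the registered variable follows the relocation, while the copy produced by the assignment expression does not. A function argument written as "(l3 = Leaf(a))" is such a copy.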
>
> Even better, you could linearize the tree of expressions into a sequence,
> and that should solve your problem, also.
>
Yes, that's the robust solution. Spelling it out:

CAMLprim value left_comb(value a, value b, value c)
{
  CAMLparam3(a, b, c);
  CAMLlocal5(la, lb, lc, tab, t);
  la = Leaf(a);
  lb = Leaf(b);
  lc = Leaf(c);
  tab = Tree(la, lb);
  t = Tree(tab, lc);
  CAMLreturn(t);
}

You can also do "CAMLreturn(Tree(tab, lc))" directly.

- Xavier Leroy
> Uh, I think. Been a while since I wrote a lotta C/C++ code to interface
> with Ocaml, but this oughta work.
>
> --chet--
>
>
> On Wed, May 2, 2018 at 9:09 AM, Frederic Perriot <fperriot@gmail.com>
> wrote:
>
>> Hello caml-list,
>>
>> I have a GC-related question. To give you some context, I'm writing a
>> tool to parse .cmi files and generate .h and .c files, to facilitate
>> constructing OCaml variants from C bindings.
>>
>> For instance, given the following source:
>>
>> type 'a tree = Leaf of 'a | Tree of 'a tree * 'a tree [@@h_file]
>>
>>
>> the tool produces C functions:
>>
>> CAMLprim value Leaf(value arg1)
>> {
>>   CAMLparam1(arg1);
>>   CAMLlocal1(obj);
>>
>>   obj = caml_alloc_small(1, 0);
>>
>>   Field(obj, 0) = arg1;
>>
>>   CAMLreturn(obj);
>> }
>>
>> CAMLprim value Tree(value arg1, value arg2)
>> {
>>   // similar code here
>> }
>>
>>
>> From there, it's tempting to nest calls to variant constructors from C
>> and write code such as:
>>
>> CAMLprim value left_comb(value a, value b, value c)
>> {
>>   CAMLparam3(a, b, c);
>>   CAMLreturn(Tree(Tree(Leaf(a), Leaf(b)), Leaf(c)));
>> }
>>
>>
>> The problem with the above is the GC root loss due to the nesting of
>> calls to allocating functions.
>>
>> Say Leaf(c) is constructed first, and the resulting value cached in a
>> register, then Leaf(b) triggers a collection, thus invalidating the
>> register contents, and leaving a dangling pointer in the top Tree.
>>
>> Here is an actual ocamlopt output, with Leaf(c) getting cached in rbx:
>>
>> 0x000000000040dbf4 <+149>: callq 0x40d8fd <Leaf>
>> 0x000000000040dbf9 <+154>: mov %rax,%rbx
>> 0x000000000040dbfc <+157>: mov -0x90(%rbp),%rax
>> 0x000000000040dc03 <+164>: mov %rax,%rdi
>> 0x000000000040dc06 <+167>: callq 0x40d8fd <Leaf>
>> 0x000000000040dc0b <+172>: mov %rax,%r12
>> 0x000000000040dc0e <+175>: mov -0x88(%rbp),%rax
>> 0x000000000040dc15 <+182>: mov %rax,%rdi
>> 0x000000000040dc18 <+185>: callq 0x40d8fd <Leaf>
>> 0x000000000040dc1d <+190>: mov %r12,%rsi
>> 0x000000000040dc20 <+193>: mov %rax,%rdi
>> 0x000000000040dc23 <+196>: callq 0x40da19 <Tree>
>> 0x000000000040dc28 <+201>: mov %rbx,%rsi
>> 0x000000000040dc2b <+204>: mov %rax,%rdi
>> 0x000000000040dc2e <+207>: callq 0x40da19 <Tree>
>>
>>
>> While the C code clearly violates the spirit of the GC rules, I can't
>> help but feel this is still a pitfall.
>>
>> Rule 2 of the manual states: "Local variables of type value must be
>> declared with one of the CAMLlocal macros. [...]"
>>
>> But here, I'm not declaring local variables, unless you count compiler
>> temporaries as local variables?
>>
>> I can see some other people making the same mistake I did. Should
>> there be an explicit warning in the rules? Maybe one underlining that
>> compiler temps count as variables, or discouraging the kind of nested
>> calls returning values displayed above?
>>
>> thanks,
>> Frédéric Perriot
>>
>> PS: this is also my first time posting to the list, so I take this
>> opportunity to thank you for the great Q's and A's I've read here over
>> the years