From: Malcolm Matalka <mmatalka@gmail.com>
To: Jeremy Yallop <yallop@gmail.com>
Cc: "David Sheets" <sheets@alum.mit.edu>,
"Jeremie Dimino" <jdimino@janestreet.com>,
"Christoph Höger" <christoph.hoeger@tu-berlin.de>,
"caml users" <caml-list@inria.fr>
Subject: Re: [Caml-list] Save callbacks from OCaml to C
Date: Thu, 04 Feb 2016 07:26:06 +0000
Message-ID: <86io25q6lt.fsf@gmail.com>
In-Reply-To: <CAAxsn=H5OSbFmp=KB9QHXEuzHfXcUiNDmkD8=p7wUBXsR0HDeQ@mail.gmail.com> (Jeremy Yallop's message of "Wed, 3 Feb 2016 16:14:55 -0800")
Jeremy Yallop <yallop@gmail.com> writes:
> On 3 February 2016 at 12:15, Malcolm Matalka <mmatalka@gmail.com> wrote:
>> Jeremy Yallop <yallop@gmail.com> writes:
>>
>>> On 3 February 2016 at 05:44, David Sheets <sheets@alum.mit.edu> wrote:
>>>> On Wed, Feb 3, 2016 at 12:26 PM, Malcolm Matalka <mmatalka@gmail.com> wrote:
>>>>> Jeremie Dimino <jdimino@janestreet.com> writes:
>>>>>> You need to register [ml_t], [ml_x] and [ml_g] as GC roots.
>>>>>> Otherwise, if the GC runs in caml_ba_alloc for instance,
>>>>>> [ml_t] might end up containing garbage even before reaching
>>>>>> [caml_callback3]. You can use the normal macros for that:
>>>>>>
>>>>> If one is using ctypes, is all of this taken care of? I have a library
>>>>> that registers a bunch of OCaml functions in C code, which the C code
>>>>> calls. I haven't experienced anything bad happening yet, but that
>>>>> doesn't mean much...
>>>>
>>>> If you use ctypes and pass OCaml closures to C, you *must* retain a
>>>> reference to the closure to avoid it being GCed. If you do not, you
>>>> may experience the exception CallToExpiredClosure sporadically.
>>>
>>> Besides David's caveat, the answer is yes: ctypes will take care of
>>> registering arguments as GC roots as necessary.
>>
>> Can you clarify this a bit? I'm not that familiar with how the C FFI
>> works. If I pass in a closure to a C function and it is registered as a
>> GC root, doesn't that mean it won't be GCd even if my OCaml program
>> forgets about it?
>
> That's how roots behave, yes: while a value is registered as a root,
> the value won't be collected. There are (roughly speaking) two types
> of root in OCaml: local roots, which persist for the duration of a
> function call, and global roots, which persist until explicitly
> released. A C function binding written by hand must ensure that OCaml
> values passed to it as arguments are registered as local roots, so
> that if a collection occurs while the function is running the values
> won't be prematurely collected.
>
> A C binding written using ctypes can generally ignore the matter of
> roots. That's partly because ctypes takes care of root registration,
> but also because most types passed between OCaml and C in a ctypes
> binding are C values, not OCaml values. For example, if you want to
> pass a structure with several fields between OCaml and C there are two
> approaches. One approach is to represent the structure as an OCaml
> record, which involves accessing the fields of the value in your C
> binding using various macros, taking care to register values as roots
> to protect them from the GC. The other approach is to represent the
> structure as a C struct, which involves accessing its fields in OCaml
> using the functions ctypes provides. (If you enjoy programming in an
> untyped dialect of C with ubiquitous concurrency, you'll probably
> favour the first approach. If you prefer programming in OCaml then
> the second approach might have some appeal.)
>
> Using the C value representation for values that cross the C-OCaml
> boundary generally works well, but when things become higher-order,
> the situation changes a bit. When a C library expects to be given a
> first-order value such as a struct we have to give it a struct with
> the appropriate layout, since C functions can directly access the
> representation of values. However, when the library expects a
> function pointer we have a bit more freedom, since the representation
> of functions isn't accessible -- in fact, the only thing that can be
> done with a function pointer, besides passing it from place to place,
> is calling it. This freedom means that we can pass an OCaml function,
> suitably packaged up, where a C function pointer is expected.
>
> Passing OCaml functions to C as function pointers raises some
> interesting issues relating to object lifetime and the garbage
> collector. The main difficulty arises from the fact that once you
> pass a function pointer to a C library there's no way of knowing how
> long the library holds on to it: for example, the library might
> discard the function pointer when the call into the library returns,
> or it might store the function pointer in a global variable to be
> retrieved and called later. In order to prevent the associated
> function from being collected prematurely, some kind of action is
> needed on the OCaml side, whether registering a global root, or
> ensuring that the function is reachable from the OCaml program.
>
>> Also, David and I were talking about how to solve this on IRC. In my
>> specific case, callbacks are one-shot, which means I know they need to
>> be remembered until they are called then they can (possibly) be freed.
>> Is there a nice solution here? I'd prefer not to store them in some
>> other data structure and remove them later just to keep a reference
>> alive, if possible.
>
> Storing some kind of references to the functions in a place that the
> collector can see is essential to prevent the functions from being
> collected prematurely. The situation is the same whether you use
> ctypes or write bindings by hand.
>
> Storing the functions in a table, and removing them automatically
> after they're called is one approach. An alternative is to use the
> new Ctypes.Roots module, which will be available in the next release:
>
> https://github.com/ocamllabs/ocaml-ctypes/blob/182a9e64src/ctypes/ctypes.mli#L419-L435

Thank you for the thorough response. It seems like Ctypes.Roots might
solve my problem, although the URL gives me a 404. Do you have an
estimate of when this will be released (or is there anything someone
like myself can do to help)?

>
>> That is overhead I'd prefer to avoid, if possible.
>> I plan on having possibly hundreds of thousands of these callbacks alive
>> at any point in time.
>
> In that case it sounds like there'll be an overhead of up to a few megabytes.

Any suggestions for a datatype to use here? I do have a long-lived
object that represents the event loop I'm integrating against, so I
can store anything I want in there. Last night I was really concerned
about storing this extra information in the loop; it just seemed like a
waste, but in the morning light I'm less worried about it. I could just
use a Hashtbl, I guess, holding some reference to the closure. My current
idea is to generate an integer id and wrap the closure up in something
like:

    (fun () -> Hashtbl.remove t id; closure ())

What kind of sucks about that is that the wrapper needs to be unique to
each type of closure that gets called; there doesn't seem to be a really
generic way to do this wrapping. Am I on the wrong track?
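
For what it's worth, the wrapper may not need to be rewritten for each
callback type. Below is a minimal sketch, assuming it is enough to keep
the wrapped closure reachable from a table until it fires. An
existential constructor lets closures of any type share one Hashtbl.
The names any, registry, one_shot and live are made up for illustration
and are not part of ctypes.

```ocaml
(* Existential box: hides the closure's type so one table can hold
   callbacks of any type. Hypothetical name, not a ctypes type. *)
type any = Any : 'a -> any

(* The table could live inside the long-lived event-loop object. *)
type registry = {
  table : (int, any) Hashtbl.t;
  mutable next_id : int;
}

let create () = { table = Hashtbl.create 16; next_id = 0 }

(* Wrap [f] so that it stays reachable through [reg] until its first
   call. The wrapper has the same type as [f]; for curried callbacks
   the entry is removed when the first argument is applied. *)
let one_shot (reg : registry) (f : 'a -> 'b) : 'a -> 'b =
  let id = reg.next_id in
  reg.next_id <- id + 1;
  let wrapped x =
    Hashtbl.remove reg.table id;
    f x
  in
  Hashtbl.add reg.table id (Any wrapped);
  wrapped

(* Number of callbacks still waiting to fire. *)
let live (reg : registry) = Hashtbl.length reg.table
```

Here one_shot reg f returns the closure that would be handed to C in
place of f; the table entry is what keeps it alive until it is called.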
Thanks again,
/Malcolm