* Bigarrays and temporar C pointers
@ 2005-01-15 23:40 Daniel Bünzli
2005-01-16 2:28 ` [Caml-list] " John Prevost
From: Daniel Bünzli @ 2005-01-15 23:40 UTC (permalink / raw)
To: caml-list caml-list
Hello,
Suppose that I have a C library which allows me to access data via a
temporary pointer with the following interface:
> void *map(void); /* Returns a valid pointer to data */
> int map_size(void); /* Returns the size of the data in bytes. */
> void unmap(void); /* Invalidates the last pointer returned by map. */
Mapping data to a pointer must only be done for a short period of time:
map, get size, process data, and unmap.
I would like to be able to process the data in OCaml with bigarrays. To do
so I provide the OCaml function `map'. This function maps the pointer,
passes it as a bigarray to a user callback to process the data and then
unmaps the pointer.
> open Bigarray;;
>
> type ('a, 'b) data = ('a, 'b, c_layout) Array1.t
>
> val map : ('a, 'b) kind -> (('a, 'b) data -> unit) -> unit
Map is implemented as follows (C primitives are at the end of the mail):
> external _map_ptr : ('a, 'b) kind -> ('a, 'b) data = "stub_map_ptr"
> external _unmap_ptr : ('a, 'b) data -> unit = "stub_unmap_ptr"
>
> let map k f =
> let a = _map_ptr k in
> f a;
> _unmap_ptr a
My problem is that the provided bigarray may escape the scope of the
user callback (e.g. by setting a global reference to the bigarray)
potentially allowing the user to access data at an invalid pointer
position after the pointer was invalidated.
For the bigarray itself this is not a problem: I set its
dimension to zero when I unmap it in _unmap_ptr (see the C
implementation below), so any access outside the user callback raises
an exception. However, according to my experiments and my reading of the
bigarray implementation, this doesn't work if the user extracts a
subarray with Array1.sub and stores it in a global variable.
Is there a solution that makes this completely safe, or can I only warn
the user not to let data escape from the callback?
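As a minimal illustration of the escape, here is a pure-OCaml sketch. It does not use the C library at all: the buffer comes from Array1.create rather than from map, and the names parent, leaked and callback are made up for the example. It only shows that a view created with Array1.sub shares storage with its parent while carrying its own dimension, which is consistent with the observation above that resetting the parent's dimension does not reach the view.

```ocaml
open Bigarray

(* Stand-in for the mapped buffer: an ordinary bigarray, not a C pointer. *)
let parent = Array1.create int c_layout 4
let () = Array1.fill parent 0

(* The callback leaks a sub-array through a global reference. *)
let leaked = ref None
let callback a = leaked := Some (Array1.sub a 1 2)
let () = callback parent

(* The view has its own dimension but aliases the parent's storage. *)
let view = match !leaked with Some v -> v | None -> assert false
let () = parent.{1} <- 42
let seen_through_view = view.{0}
```

With a real mapped pointer, the leaked view would keep pointing into memory that unmap has already invalidated.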
Thanks for your help,
Daniel
The implementation of the C primitives :
> extern int bigarray_element_size[]; /* bigarray_stubs.c */
>
> CAMLprim value stub_map_ptr (value kind)
> {
> void *p = map ();
> long dim = map_size () / bigarray_element_size[Int_val (kind)];
> int flag = Int_val (kind) | BIGARRAY_C_LAYOUT | BIGARRAY_EXTERNAL;
> return alloc_bigarray(flag, 1, p, &dim);
> }
>
> CAMLprim value stub_unmap_ptr (value b)
> {
> struct caml_bigarray *arr = Bigarray_val(b);
> arr->data = NULL;
> arr->dim[0] = 0;
> unmap();
> return Val_unit;
> }
* Re: [Caml-list] Bigarrays and temporar C pointers
From: John Prevost @ 2005-01-16 2:28 UTC (permalink / raw)
To: Daniel Bünzli; +Cc: caml-list caml-list
Well, assuming you really need to work in this strange way, I have a
couple of thoughts on how to do it. Note that it's going to be rather
unsound to work with this in any case, but at least you will get
exceptions instead of core dumps or worse. The heart of the matter is
that you should *not* allow the user to manipulate the data array
directly.
(* assumes `open Bigarray' and the externals _map_ptr / _unmap_ptr *)
module Scary_map_thingy_1 =
  (struct
    exception Scary_map_unmapped
    exception Scary_map_conflict
    (* the handle is emptied once the data is unmapped *)
    type ('a, 'b) t = ('a, 'b, c_layout) Array1.t option ref
    let dim a = match !a with
      | None -> raise Scary_map_unmapped
      | Some a' -> Array1.dim a'
    (* replicate other functionality from Array1 below *)
    let already_mapped = ref false
    let map k f =
      if !already_mapped then raise Scary_map_conflict else
      let arr = _map_ptr k in
      let a = ref (Some arr) in
      let cleanup () =
        _unmap_ptr arr;        (* unmap the array, not the kind *)
        a := None;
        already_mapped := false in
      already_mapped := true;
      (try f a with exn -> cleanup (); raise exn);
      cleanup ()
  end : sig
    type ('a, 'b) t
    exception Scary_map_unmapped (* It escaped scope *)
    exception Scary_map_conflict (* Tried to map inside map *)
    val dim : ('a, 'b) t -> int
    (* rest of replicated API *)
    val map : ('a, 'b) kind -> (('a, 'b) t -> unit) -> unit
  end)
So the approach here is to wrap the value in such a way that it
doesn't matter if it escapes the scope. This is not really any better
than what you have now. You still have to warn the user *not* to
allow it to escape scope, since it won't work. But the benefit is
that it is guaranteed to fail with an exception if the user tries it.
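To see the failure mode of this first approach without the C library, here is a self-contained sketch. fake_map and fake_unmap are hypothetical stand-ins for _map_ptr and _unmap_ptr (they allocate an ordinary bigarray instead of mapping anything), the module name Guarded is made up, and the nested-map check is left out for brevity. It uses Fun.protect, so it assumes OCaml 4.08 or later.

```ocaml
open Bigarray

module Guarded : sig
  type ('a, 'b) t
  exception Unmapped
  val dim : ('a, 'b) t -> int
  val get : ('a, 'b) t -> int -> 'a
  val map : ('a, 'b) kind -> (('a, 'b) t -> unit) -> unit
end = struct
  type ('a, 'b) t = ('a, 'b, c_layout) Array1.t option ref
  exception Unmapped

  (* Hypothetical stand-ins for the C stubs: allocate instead of mapping. *)
  let fake_map k = Array1.create k c_layout 8
  let fake_unmap _ = ()

  let arr h = match !h with None -> raise Unmapped | Some a -> a
  let dim h = Array1.dim (arr h)
  let get h i = Array1.get (arr h) i

  let map k f =
    let a = fake_map k in
    let h = ref (Some a) in
    (* Invalidate the handle whether f returns or raises. *)
    Fun.protect ~finally:(fun () -> fake_unmap a; h := None) (fun () -> f h)
end

(* A callback that leaks its handle through a global reference. *)
let escaped = ref None
let dim_inside = ref 0
let () =
  Guarded.map int (fun h -> dim_inside := Guarded.dim h; escaped := Some h)

(* Using the leaked handle after map has returned raises Unmapped. *)
let raised =
  match !escaped with
  | None -> false
  | Some h -> (try ignore (Guarded.dim h); false with Guarded.Unmapped -> true)
```

Because the type is abstract and every accessor goes through the option, there is no way to obtain the underlying Array1.t, and hence no way to call Array1.sub on it.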
The second approach is to prevent the user from accessing the data directly:
module Scary_map_thingy_2 =
  (struct
    exception Scary_map_conflict
    let already_mapped = ref false
    let map k f =
      if !already_mapped then raise Scary_map_conflict else
      let a = _map_ptr k in
      let cleanup () =
        _unmap_ptr a;
        already_mapped := false in
      already_mapped := true;
      (try
         (* valid indices run from 0 to dim - 1 *)
         for i = 0 to Array1.dim a - 1 do
           f i a.{i}
         done
       with exn -> cleanup (); raise exn);
      cleanup ()
  end : sig
    exception Scary_map_conflict (* Tried to map inside map *)
    val map : ('a, 'b) kind -> (int -> 'a -> unit) -> unit
  end)
In this second case, the approach is to prevent the caller from ever
getting a handle on the actual data array. Instead, the map is made,
the caller is handed every (index, value) pair from the array in turn,
and then the map is unmade. This is much more restrictive, but also
much safer.
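A runnable sketch of this second approach, again without the C library: fake_map and fake_unmap are hypothetical stand-ins for the stubs, and the extra init argument exists only so that the example has known contents to iterate over. Fun.protect requires OCaml 4.08 or later.

```ocaml
open Bigarray

(* Hypothetical stand-ins for the C stubs: a small pre-filled bigarray. *)
let fake_map k init =
  let a = Array1.create k c_layout 4 in
  Array1.fill a init;
  a
let fake_unmap _ = ()

(* The callback sees (index, value) pairs only, never the array. *)
let iter_mapped k init f =
  let a = fake_map k init in
  Fun.protect ~finally:(fun () -> fake_unmap a) (fun () ->
    for i = 0 to Array1.dim a - 1 do
      f i a.{i}
    done)

let total = ref 0
let last_index = ref (-1)
let () =
  iter_mapped int 5 (fun i v -> total := !total + v; last_index := i)
```

The price is visible in the callback's type: it can read each element but cannot write back or access elements out of order.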
Finally, this kind of approach might be best if mapping and unmapping
is not particularly expensive, and you trust the user to act better:
module Scary_map_thing =
  (struct
    (* `some', `specific' and some_specific_kind are placeholders for the
       single element type and kind this module is specialised to *)
    let data = (ref None : (some, specific, c_layout) Array1.t option ref)
    let hold_count = ref 0
    let hold () =
      begin
        incr hold_count;
        match !data with
        | Some _ -> ()
        | None -> data := Some (_map_ptr some_specific_kind)
      end
    let unhold () =
      begin
        decr hold_count;
        (* unmap only when the last holder lets go *)
        match !hold_count, !data with
        | 0, Some a -> _unmap_ptr a; data := None
        | _ -> ()
      end
    let work f =
      begin
        hold ();
        try
          let result = f () in
          unhold ();
          result
        with exn -> (unhold (); raise exn)
      end
    (* inside work the data is always mapped, so None cannot occur *)
    let dim () = work (fun () ->
      match !data with Some a -> Array1.dim a | None -> assert false)
    (* rest of modified Array1 calls here *)
  end : sig
    val work : (unit -> 'a) -> 'a
    val dim : unit -> int
    (* rest of modified Array1 calls *)
  end)
In this last approach, instead of wrapping that array up in a data
structure, we wrap it up in a module. The module either has a
currently mapped copy of the data, or it doesn't. If you call
Scary_map_thing.dim (), you get the dimensions of the data, no matter
what. If the data was unmapped when you called dim, it is mapped, the
value is gotten, then it is unmapped. If you have a *lot* of work to
do and wish to avoid mapping and unmapping constantly, you can wrap
your function up in work like this:
let myfunc () =
  Scary_map_thing.work (fun () ->
    (* valid indices run from 0 to dim - 1 *)
    for i = 0 to Scary_map_thing.dim () - 1 do
      Scary_map_thing.set i (Scary_map_thing.get i + 1)
    done)
Which will map it once, then use it a lot, then unmap it at the end.
This version also doesn't throw up if you call a function that tries
to map while mapped--it just increments a counter.
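The counting behaviour can be checked with a self-contained sketch. Everything here is an assumption made for the example: the module name Held, the fake_map / fake_unmap stand-ins for the C stubs, the fixed int backing store, and the map_calls counter that records how often a mapping was requested. Fun.protect requires OCaml 4.08 or later.

```ocaml
open Bigarray

module Held : sig
  val work : (unit -> 'a) -> 'a
  val dim : unit -> int
  val get : int -> int
  val set : int -> int -> unit
  val map_calls : unit -> int
end = struct
  (* Hypothetical backing store standing in for the C-side data. *)
  let store = Array1.create int c_layout 3
  let () = Array1.fill store 0
  let maps = ref 0
  let fake_map () = incr maps; store
  let fake_unmap _ = ()

  let data = ref None
  let hold_count = ref 0

  let hold () =
    incr hold_count;
    match !data with
    | Some _ -> ()
    | None -> data := Some (fake_map ())

  let unhold () =
    decr hold_count;
    match !hold_count, !data with
    | 0, Some a -> fake_unmap a; data := None
    | _ -> ()

  (* Release the hold whether f returns or raises. *)
  let work f =
    hold ();
    Fun.protect ~finally:unhold f

  let current () = match !data with Some a -> a | None -> assert false
  let dim () = work (fun () -> Array1.dim (current ()))
  let get i = work (fun () -> (current ()).{i})
  let set i v = work (fun () -> (current ()).{i} <- v)
  let map_calls () = !maps
end

(* One outer work: the nested dim/get/set calls reuse the same mapping. *)
let () =
  Held.work (fun () ->
    for i = 0 to Held.dim () - 1 do
      Held.set i (Held.get i + 1)
    done)
let maps_during_loop = Held.map_calls ()

(* A call outside work maps and unmaps on its own. *)
let first = Held.get 0
let maps_after_get = Held.map_calls ()
```

The whole loop costs a single mapping, while the stand-alone Held.get afterwards triggers a second one.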
This third solution may be the best one overall, especially because
there are a number of ways it can be improved. For example:
* If you need to be able to map as multiple different kinds, then
you can provide more useful state in the code, so that you track
what kind it is currently mapped as, and adjust the mapping as
needed in order to work safely. In this case, Scary_map_thing.dim
would have the type ('a, 'b) kind -> int, and likewise all of the
modified Array1 calls would take kinds instead of unit or actual
arrays.
* If you want to avoid mapping and unmapping in a more general way, and
use threads, you could start up a worker thread in the module that
keeps track of the last time the data was used, and unmaps it if
it hasn't been used in a certain amount of time.
Finally, please note that none of the skeletal solutions I describe
above are thread-safe. If more than one thread can be working at a
time (like with the unmap-on-timeout extension), you need to be more
careful about modifying the internal state. Note that I think
solution 3 is the only one that can cleanly handle threads at all,
since it's the only one that can handle multiple people wanting to
work with the data all at once.
Hope these ideas were useful,
John.
* Re: [Caml-list] Bigarrays and temporar C pointers
From: Daniel Bünzli @ 2005-01-16 12:31 UTC (permalink / raw)
To: John Prevost; +Cc: caml-list caml-list
On 16 Jan 2005, at 03:28, John Prevost wrote:
> Well, assuming you really need to work in this strange way,
Yes, some parts of OpenGL work like this, e.g. vertex buffer objects [1]. In
fact there are a lot more things that one is not allowed to do while a buffer
is mapped, and it is not possible to enforce every constraint (however
most, if not all, of these errors just lead to GL errors, not to core
dumps).
Anyway, thanks for your time and code. Especially for the handling of
exceptions occurring in the callback, which I had completely forgotten.
I hadn't considered, as you suggest, making the type
> type ('a, 'b) data = ('a, 'b, c_layout) Array1.t
abstract and replicating the "allowed" functionality of Array1 in the
module. I hope that this won't prevent the optimisations present in
the compiler for bigarrays.
> let set = Array1.set
> let get = Array1.get
> ...
The only problem I see is that the user loses the ability to use
existing bigarray code and the lighter syntax to access/write the
array. On the other hand I can prevent the user from extracting
subarrays and I'm on the safe side again.
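A minimal sketch of that abstract-type variant, for illustration only. The module name Data and the of_array1 constructor are hypothetical; the constructor exists purely so the example can run without the C stubs and would not be exported in the real binding. Whether the compiler still specialises accesses made through such wrappers is not shown here and would need measuring.

```ocaml
open Bigarray

module Data : sig
  type ('a, 'b) t   (* abstract: Array1.sub cannot be applied to it *)
  val dim : ('a, 'b) t -> int
  val get : ('a, 'b) t -> int -> 'a
  val set : ('a, 'b) t -> int -> 'a -> unit
  (* Hypothetical constructor, present only so the sketch can run. *)
  val of_array1 : ('a, 'b, c_layout) Array1.t -> ('a, 'b) t
end = struct
  type ('a, 'b) t = ('a, 'b, c_layout) Array1.t
  let dim = Array1.dim
  let get = Array1.get
  let set = Array1.set
  let of_array1 a = a
end

let d = Data.of_array1 (Array1.create int c_layout 2)
let () = Data.set d 0 7; Data.set d 1 8
let sum = Data.get d 0 + Data.get d 1
(* Array1.sub d 0 1 would be rejected by the type checker: d is a Data.t. *)
```

The d.{i} syntax is lost for the same reason: it only applies to values whose type is known to be a bigarray.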
There's a tradeoff and I cannot make up my mind right now.
Daniel
[1] <http://oss.sgi.com/projects/ogl-sample/registry/ARB/vertex_buffer_object.txt>