* [Caml-list] gc question: thread stacks, fibers, etc.
@ 2002-10-04 10:25 Chris Hecker
2002-10-06 18:13 ` [Caml-list] Coroutines Jerome Vouillon
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Chris Hecker @ 2002-10-04 10:25 UTC (permalink / raw)
To: caml-list; +Cc: buzzard
I'm looking at implementing fibers/coroutines/cooperative-threads
(described below for reference) in ocaml, and I've run into a small issue,
a question, and a couple confusions.
First, the issue: Because they're cooperative, there's no way a fiber can
run unless somebody tells it to do so using its handle/Fiber.t. This means
that if a fiber's handle goes out of scope, the entire fiber, including its
stack, is garbage and should be able to be collected (unlike a thread,
where even if the Thread.t is gone, the thread is still
running/scheduled). However, I can't figure out if it's possible to
implement fibers so they work like this.
In more detail: a fiber is going to need a stack, as mentioned
below. Following the systhreads code (win32.c & posix.c), I'll allocate
the stack outside the caml heap. This stack will get scanned by the gc
using the scan_roots_hook from C, just like in the systhreads code. The
problem is, this design treats the stack as a root, so the stack itself
can't be gc'd, even if it's orphaned.
Is there a way to make the stack just another heap object? I could easily
put the pointer to the stack in an Abstract_tag block, but then the stack
won't get scanned by the gc at all, which is bad. It seems like what I
want is another kind of block or another function in the custom_operations
for "let me manually scan the opaque data in this block, which itself
points to data that contains values, and if it's all garbage, finalize me".
Does that make sense? The behavior I want is to not have to keep track of
the fibers if I don't care if they get gc'd, just like other objects.
My next question is a clarification of the way caml works: the stack can
contain values, but never blocks, right? In C terms, it can contain
pointers but not actual objects that other people can point to? All actual
boxed data is on the heap, right? So, I can just delete a fiber's stack
and it'll work? Put it yet another explicit way, the reason you have to
register roots when you work with values on the stack in C is because you
don't want the gc to free things you're pointing to, not because of any
funkiness of others pointing into your stack.
Finally, a couple confusions:
- final_fun in caml_thread_handle in win32.c is not used?
- HANDLE in caml_thread_handle in win32 seems odd, since it's scanned by
the gc, yet it's a windows handle. who's to say that it won't randomly end
up as a value that looks like a pointer into one of the caml heaps?
Chris
Fibers/Coroutines/Cooperative-Threads
You can read a bunch about these on the net. Basically, they're
lightweight nonpreemtive threads. They allow you to yield back to the
creator (or a general scheduler) anywhere in the fiber with an explicit
yield(), and when the fiber is resumed it returns from the yield and
continues on. They're very useful for breaking up long computations where
real threading is too complex to be worthwhile, but where breaking up the
computation into chunks is a mess (and generates a heinous FSM). Game AIs
are a good example, network protocol negotiations are another. In C, you
implement them by mallocing a stack and with a little snippet of asm, or
mucking with the jmpbuf fields. They're pretty trivial to get working
fairly robustly (how you grow the stack is an issue). I guess they're sort
of related to continuations (or at least, you can implement coroutines if
your language supports continuations), but they don't seem to be as heavy-duty.
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Caml-list] Coroutines
2002-10-04 10:25 [Caml-list] gc question: thread stacks, fibers, etc Chris Hecker
@ 2002-10-06 18:13 ` Jerome Vouillon
2002-10-06 19:46 ` Chris Hecker
2002-10-12 15:58 ` John Max Skaller
2002-10-12 16:33 ` [Caml-list] gc question: thread stacks, fibers, etc John Max Skaller
2002-10-13 8:32 ` Xavier Leroy
2 siblings, 2 replies; 8+ messages in thread
From: Jerome Vouillon @ 2002-10-06 18:13 UTC (permalink / raw)
To: Chris Hecker; +Cc: caml-list
On Fri, Oct 04, 2002 at 03:25:44AM -0700, Chris Hecker wrote:
> I'm looking at implementing fibers/coroutines/cooperative-threads
> (described below for reference) in ocaml, and I've run into a small issue,
> a question, and a couple confusions.
I just spent a few hours implementing a small coroutine library. It
is fully written in Ocaml. Below is a quick description. Would it
satisfy your needs ? I can send you a copy, or make it available on the
Web if you like.
-- Jérôme
A coroutine returning values of type 'a has type 'a output.
type 'a output
You can spawn it. You then get an handle of type 'a input
type 'a input
val spawn : 'a output -> 'a input
The caller can get a value from the coroutine.
val receive : 'a input -> 'a
A coroutine can return a value of type 'a.
val send : 'a -> 'a output
It can also exit. If a caller try to get more values from the
coroutine afterwards, an exception Exited will be raised.
exception Exited
val exit : 'a output
Coroutines can be combined sequentially: "e >> fun () -> e'" is a
routine that behaves first as "e", then as "e'".
val (>>) : 'a output -> (unit -> 'a output) -> 'a output
It is sometimes useful to have a coroutine "nop" that does nothing.
Note that the expression "nop >> fun () -> e" behaves as "e", while
the expression "exit >> (fun ()-> e)" behaves as "exit".
val nop : 'a output
So for instance, the coroutine f below returns the integers 1, 2 and
3, then exits.
# let f =
spawn
(send 1 >> fun () ->
send 2 >> fun () ->
send 3);;
val f : int Coroutines.input = <abstr>
# receive f;;
- : int = 1
# receive f;;
- : int = 2
# receive f;;
- : int = 3
# receive f;;
Exception: Coroutines.Exited.
You can define a coroutine "indexed ~len:n f" which returns
successively "f 0", "f 1", ... "f (n - 1)".
let rec indexed_rec f n l =
if n = l then exit else
send (f n) >> fun () ->
indexed_rec f (n + 1) l
let indexed ?(len = -1) f = spawn (indexed_rec f 0 len)
You can use it to define an array iterator.
let array_iterator a = indexed ~len:(Array.length a) (fun i -> a.(i))
So, "array_iterator [|1;2;3|]" will behave just the same as "f" above.
The coroutine below returns all integers starting from "first".
let integers first = indexed (fun i -> i + first)
As a more complex example, here is the Erathostenes sieve implemented
using coroutines.
let rec filter_rec p st =
let v = receive st in
(if p v then send v else nop) >> fun () ->
filter_rec p st
let filter p st = spawn (filter_rec p st)
let filter_multiples n st = filter (fun m -> not (m mod n = 0)) st
let rec primes_rec st =
let n = receive st in
send n >> fun () ->
primes_rec (filter_multiples n st)
let primes = spawn (primes_rec (integers 2))
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Caml-list] Coroutines
2002-10-06 18:13 ` [Caml-list] Coroutines Jerome Vouillon
@ 2002-10-06 19:46 ` Chris Hecker
2002-10-12 15:58 ` John Max Skaller
1 sibling, 0 replies; 8+ messages in thread
From: Chris Hecker @ 2002-10-06 19:46 UTC (permalink / raw)
To: Jerome Vouillon; +Cc: caml-list
>I just spent a few hours implementing a small coroutine library. It
>is fully written in Ocaml. Below is a quick description. Would it
>satisfy your needs ? I can send you a copy, or make it available on the
>Web if you like.
Ah, it looks like your library is done with cps. The problem is, how would
you yield in the middle of a for-loop or something? I assume the answer is
"don't do that" :), which is valid, but a bit frustrating if you just want
to turn any bit of code into a fiber and be able to yield anywhere it's
convenient. On the upside, you can do yours in vanilla caml. Definitely
post your library on the web, I'd be interested in seeing it. Thanks!
I realized after I sent my post about the stack that I can easily prototype
my Fiber library with Threads. I'll call what I want Fibers to
differentiate them from Coroutines where there's value passing a la Knuth,
and to imply that Fibers are imperative in nature. I can make a thread per
fiber, and then have a mutex per thread, and have my yield/switch calls do
the right mutex silliness to make the threads behave cooperatively rather
than preemptively.
This is a bit heavyweight for what I want in the systhreads case (I'd like
Fibers to be very lightweight and cheap, since they don't need any os
machinery, just a quick context load), but it will allow a fully functional
(er, operational :) prototype so I can see if this is what I really
want. I could ease a bit of the efficiency concerns by adding a thread api
for turning off the tick thread in the case where no "real" threads are
created, since mine will never be able to be preempted anyway so setting
young_limit and the signal is just a waste of time.
Chris
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Caml-list] Coroutines
2002-10-06 18:13 ` [Caml-list] Coroutines Jerome Vouillon
2002-10-06 19:46 ` Chris Hecker
@ 2002-10-12 15:58 ` John Max Skaller
1 sibling, 0 replies; 8+ messages in thread
From: John Max Skaller @ 2002-10-12 15:58 UTC (permalink / raw)
To: caml-list
Jerome Vouillon wrote:
> On Fri, Oct 04, 2002 at 03:25:44AM -0700, Chris Hecker wrote:
>
>>I'm looking at implementing fibers/coroutines/cooperative-threads
>>(described below for reference) in ocaml, and I've run into a small issue,
>>a question, and a couple confusions.
>>
>
> I just spent a few hours implementing a small coroutine library. It
> is fully written in Ocaml. Below is a quick description. Would it
> satisfy your needs ? I can send you a copy, or make it available on the
> Web if you like.
>
> -- Jérôme
>
> A coroutine returning values of type 'a has type 'a output.
>
> type 'a output
>
> You can spawn it. You then get an handle of type 'a input
>
> type 'a input
> val spawn : 'a output -> 'a input
>
> The caller can get a value from the coroutine.
>
> val receive : 'a input -> 'a
>
> A coroutine can return a value of type 'a.
>
> val send : 'a -> 'a output
>
These aren't general coroutines, but a subclass I'd call
'filters' corresponding to unix processes with standard input
and output.
[Felix coroutines are even more restricted .. they can read input,
but there is no provision for output.]
Despite the apparent lack of generality,
a very wide class of coroutines can be implemented by careful
choice of the data type 'a, and 'hand written' switches which
provide further coroutine emulation within each library coroutine.
So I think this library has a good interface, indeed,
I'd consider it for the standard distribution. It is simple
and lightweight.
BTW: I note the interface is 'demand driven' .. you make it
work by reading data from the last stage of a filter chain,
and control ripples back to the first input stage.
Felix, on the other hand, is event driven .. you make it
works by sending data to coroutines ..
'Real' coroutines can read and write on multiple channels,
exactly like threads with blocking I/O except there
is no preemptive switching, obviating the need for
thread safe coding.
Two well known (and very good) implementations:
Windows 3.1 and early Mac OS.
Finally .. you might want to consider JoCaml ..
it is very beautiful (though the implementation using
Posix threads is likely to be slow compared to
native exchange of control primitives)
--
John Max Skaller, mailto:skaller@ozemail.com.au
snail:10/1 Toxteth Rd, Glebe, NSW 2037, Australia.
voice:61-2-9660-0850
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Caml-list] gc question: thread stacks, fibers, etc.
2002-10-04 10:25 [Caml-list] gc question: thread stacks, fibers, etc Chris Hecker
2002-10-06 18:13 ` [Caml-list] Coroutines Jerome Vouillon
@ 2002-10-12 16:33 ` John Max Skaller
2002-10-12 18:54 ` Chris Hecker
2002-10-13 8:32 ` Xavier Leroy
2 siblings, 1 reply; 8+ messages in thread
From: John Max Skaller @ 2002-10-12 16:33 UTC (permalink / raw)
To: caml-list
Chris Hecker wrote:
>
> Fibers/Coroutines/Cooperative-Threads
> In C, you implement them by mallocing a stack and with a little snippet
> of asm, or mucking with the jmpbuf fields. They're pretty trivial to
> get working fairly robustly
.. and fairly useless in most demanding applications due to the
impossibility of dynamically extending/shrinking the stack.
If that isn't a problem .. you might as well just use
real threads ..
Felix solves this problem by using heap allocated stack
frames .. there is a cost in that many procedure calls
must be done indirectly via a driver routine which
maintains the top of stack pointer for each thread.
It also preserves high efficiency by disallowing
blocking I/O (yields) in functional code, which use
the machine stack as usual.
BTW: I consider the technique a case of continuations,
but it is probably more correct to consider it
as a system of resumptions.
--
John Max Skaller, mailto:skaller@ozemail.com.au
snail:10/1 Toxteth Rd, Glebe, NSW 2037, Australia.
voice:61-2-9660-0850
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Caml-list] gc question: thread stacks, fibers, etc.
2002-10-12 16:33 ` [Caml-list] gc question: thread stacks, fibers, etc John Max Skaller
@ 2002-10-12 18:54 ` Chris Hecker
0 siblings, 0 replies; 8+ messages in thread
From: Chris Hecker @ 2002-10-12 18:54 UTC (permalink / raw)
To: John Max Skaller, caml-list
>>In C, you implement them by mallocing a stack and with a little snippet
>>of asm, or mucking with the jmpbuf fields. They're pretty trivial to get
>>working fairly robustly
>.. and fairly useless in most demanding applications due to the
>impossibility of dynamically extending/shrinking the stack.
>If that isn't a problem .. you might as well just use
>real threads ..
Hmm, not sure I understand the last sentence there.
If you care about not having a fixed stack, you can handle it in the same
way a thread handles its stack if you're willing to write os-dependent code
(set guard pages, catch exceptions, etc.). In ocaml, you can probably
realloc in a non-os-dependent way fairly effectively, because you've got
all the stack frames around in the frame tables and there are no pointers
into the stack (well, except if you've called C functions in the middle...hmm).
Ignoring that, one reason you want fibers instead of threads is that you
don't want preemption for some problems because it complicates your code to
handle the asynchrony. If you're saying you should just use non-preemtive
threads (or make them non-preemptive with mutexes), well that's what I'm
calling fibers.
Chris
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Caml-list] gc question: thread stacks, fibers, etc.
2002-10-04 10:25 [Caml-list] gc question: thread stacks, fibers, etc Chris Hecker
2002-10-06 18:13 ` [Caml-list] Coroutines Jerome Vouillon
2002-10-12 16:33 ` [Caml-list] gc question: thread stacks, fibers, etc John Max Skaller
@ 2002-10-13 8:32 ` Xavier Leroy
2002-10-14 7:18 ` Chris Hecker
2 siblings, 1 reply; 8+ messages in thread
From: Xavier Leroy @ 2002-10-13 8:32 UTC (permalink / raw)
To: Chris Hecker; +Cc: caml-list, buzzard
> Is there a way to make the stack just another heap object? I could easily
> put the pointer to the stack in an Abstract_tag block, but then the stack
> won't get scanned by the gc at all, which is bad.
I'm not sure whether you're talking about bytecode stacks or
native-code stacks. For bytecode stacks, all stack slots contain
valid Caml values, hence you could conceptually store the stack in a
zero-tagged block and let the GC scan it. There are several
"gotchas", though:
- The GC will scan the whole memory block, not stopping at the stack
pointer. Hence, some stack slots that have been popped already will
be treated as live pointers, delaying the collection of otherwise dead
blocks.
- Exception handler blocks in the stack are chained using direct
pointers. Hence, relocating the stack (like the minor copying
collector and the heap compactor do) will screw this chaining.
Native-code stacks are much more complex and absolutely require a
special scanning function that interprets the stack frame maps
generated by ocamlopt.
> It seems like what I
> want is another kind of block or another function in the custom_operations
> for "let me manually scan the opaque data in this block, which itself
> points to data that contains values, and if it's all garbage, finalize me".
This is one way to go about your problem, but such custom scanning
functions are not currently supported.
An alternative is to allocate the stack outside the heap, and scan it
via scanning hooks like the thread library does, but manipulate it
from the Caml side through a proxy, heap-allocated custom block. The
custom block has a finalization function that the GC will call when no
reference to the proxy (i.e. to the fiber) remain. The finalization
function can then decide to kill the fiber (free its stack) if it so
pleases.
> My next question is a clarification of the way caml works: the stack can
> contain values, but never blocks, right? In C terms, it can contain
> pointers but not actual objects that other people can point to? All actual
> boxed data is on the heap, right? So, I can just delete a fiber's stack
> and it'll work?
Correct. OCaml never allocates GCed memory blocks in the stack, hence
there are no pointers from the heap into the stack.
> Finally, a couple confusions:
> - final_fun in caml_thread_handle in win32.c is not used?
It is just a placeholder where alloc_final will store a pointer to a
finalization function (caml_thread_finalize). This should be
rewritten to use the new "custom blocks" API.
> - HANDLE in caml_thread_handle in win32 seems odd, since it's scanned by
> the gc, yet it's a windows handle.
No, it's not scanned by the GC because it resides in a block with tag
Custom_tag.
Hope this answers your questions,
- Xavier Leroy
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Caml-list] gc question: thread stacks, fibers, etc.
2002-10-13 8:32 ` Xavier Leroy
@ 2002-10-14 7:18 ` Chris Hecker
0 siblings, 0 replies; 8+ messages in thread
From: Chris Hecker @ 2002-10-14 7:18 UTC (permalink / raw)
To: Xavier Leroy; +Cc: caml-list
>An alternative is to allocate the stack outside the heap, and scan it
>via scanning hooks like the thread library does, but manipulate it
>from the Caml side through a proxy, heap-allocated custom block. The
>custom block has a finalization function that the GC will call when no
>reference to the proxy (i.e. to the fiber) remain. The finalization
>function can then decide to kill the fiber (free its stack) if it so
>pleases.
But this doesn't solve the original problem of the fiber containing a
reference to itself on its stack (or any fiber points to any other fiber),
right? The scanning hook functions assume they're scanning roots, so they
can't check for circularity and release the whole mess. The function I
need is something that says "the values I'm sending to the scanner are not
stored in roots, so check for circularities that includes them". Or
something like that. Make sense? Perhaps there's a way to hook that in
the gc? Maybe I'm missing something.
Anyway, ignoring the circularity problem for the moment, deleting the stack
from under a fiber works because the stack itself can never have a pointer
from the heap into it, so it's always okay to delete (according to your
answer to my other question). I guess you'd have to be careful that no C
functions that were still on the stack of that fiber had ever passed a
pointer into their stack to anyone who keeps it elsewhere (which in
ordinary cases is okay, since when writing the C code you assume you'll get
returned-to in an orderly fashion). Maybe that's an okay constraint to
place on code, but a bug of that nature would certainly be hard to find, so
maybe it's not okay (this is why people shouldn't kill threads, nb. the
other thread :).
Hmm, now that I think about it, I can just use an exception for this,
exactly like the other caml-list discussion was wishing it could for
threads. In other words, I'm only ever going to consider toasting a fiber
when it's yielded, so when the fiber gets resumed it'll be in the yield
code. That code can check whether the gc has requested finalization of the
fiber, and throw an exception back up through the fiber. That will solve
all the stack issues, because the exception will unwind it all, and the
fiber can catch it if it wants and do any cleanup.
Yes, that's much cleaner than just killing it anyway. That just leaves the
circularity issue above.
Thanks!
Chris
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2002-10-14 7:18 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2002-10-04 10:25 [Caml-list] gc question: thread stacks, fibers, etc Chris Hecker
2002-10-06 18:13 ` [Caml-list] Coroutines Jerome Vouillon
2002-10-06 19:46 ` Chris Hecker
2002-10-12 15:58 ` John Max Skaller
2002-10-12 16:33 ` [Caml-list] gc question: thread stacks, fibers, etc John Max Skaller
2002-10-12 18:54 ` Chris Hecker
2002-10-13 8:32 ` Xavier Leroy
2002-10-14 7:18 ` Chris Hecker
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox