* Faking concurrency using Unix forks and pipes (ray tracing results) @ 2007-06-04 10:56 Jon Harrop 2007-06-04 15:33 ` [Caml-list] " Jonathan Bryant 0 siblings, 1 reply; 6+ messages in thread From: Jon Harrop @ 2007-06-04 10:56 UTC (permalink / raw) To: caml-list I just got a first working version of .NET style asynchronous invocation working in OCaml using process forking. The following OCaml function forks a new process and computes "f x" in that process, returning a function that blocks and returns the result using marshalling. let invoke (f : 'a -> 'b) x : unit -> 'b = let input, output = Unix.pipe() in match Unix.fork() with | 0 -> Unix.close input; let output = Unix.out_channel_of_descr output in Marshal.to_channel output (try `Res(f x) with e -> `Exn e) []; exit 0 | _ -> Unix.close output; let input = Unix.in_channel_of_descr input in fun () -> match Marshal.from_channel input with | `Res x -> x | `Exn e -> raise e This function tries to account for reraising exceptions on the parent process but that is untested. You can write a higher-order "map" function in terms of invoke like this: let ( |> ) x f = f x let map (f : 'a -> 'b) a : 'b array = Array.map (invoke f) a |> Array.map (fun f -> f()) When you apply this map to an array, a new process is forked for each element. As forking is time consuming, you should only apply this to short arrays. The performance characteristics of this approach are very interesting. Firstly, I can observe doubled performance on my dual core by invoking two simple but CPU-intensive operations concurrently: map fib [|43; 43|] However, performance is easily degraded using this approach, partly because forking is expensive but also because of other effects that I do not yet understand. My original benchmark summed the elements of an array using fold_left. For some reason, this is extremely inefficient, as if the entire array is copied. Anyway, this function is so simple that it took no time to work it into my ray tracer benchmark. The benefits of concurrency on my dual-core system reduce the time taken by OCaml from 4s to 3s. I'll try a concurrent F# version and see how it compares... -- Dr Jon D Harrop, Flying Frog Consultancy Ltd. OCaml for Scientists http://www.ffconsultancy.com/products/ocaml_for_scientists/?e ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Caml-list] Faking concurrency using Unix forks and pipes (ray tracing results) 2007-06-04 10:56 Faking concurrency using Unix forks and pipes (ray tracing results) Jon Harrop @ 2007-06-04 15:33 ` Jonathan Bryant 2007-06-04 15:39 ` Jonathan Bryant [not found] ` <9b415f950706040850v586a285ax1448d23c0c78a375@mail.gmail.com> 0 siblings, 2 replies; 6+ messages in thread From: Jonathan Bryant @ 2007-06-04 15:33 UTC (permalink / raw) To: caml-list On Jun 4, 2007, at 6:56 AM, Jon Harrop wrote: > > When you apply this map to an array, a new process is forked for > each element. > As forking is time consuming, you should only apply this to short > arrays. It might be worth your while to implement a process pool. I said before that I fork of one base process from which to fork the others, and it's not possible to make that process just a controller for a pool of waiting processes. --Jonathan ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Caml-list] Faking concurrency using Unix forks and pipes (ray tracing results) 2007-06-04 15:33 ` [Caml-list] " Jonathan Bryant @ 2007-06-04 15:39 ` Jonathan Bryant [not found] ` <9b415f950706040850v586a285ax1448d23c0c78a375@mail.gmail.com> 1 sibling, 0 replies; 6+ messages in thread From: Jonathan Bryant @ 2007-06-04 15:39 UTC (permalink / raw) To: caml-list On Jun 4, 2007, at 11:33 AM, Jonathan Bryant wrote: > process from which to fork the others, and it's not possible to > make that process just a controller for a I don't know why I put the "not" in there... --Jonathan ^ permalink raw reply [flat|nested] 6+ messages in thread
[parent not found: <9b415f950706040850v586a285ax1448d23c0c78a375@mail.gmail.com>]
* Re: [Caml-list] Faking concurrency using Unix forks and pipes (ray tracing results) [not found] ` <9b415f950706040850v586a285ax1448d23c0c78a375@mail.gmail.com> @ 2007-06-04 16:13 ` Jonathan Bryant [not found] ` <9b415f950706040933r22e1560fhc088368ccb8444fa@mail.gmail.com> 0 siblings, 1 reply; 6+ messages in thread From: Jonathan Bryant @ 2007-06-04 16:13 UTC (permalink / raw) To: Benedikt Grundmann; +Cc: caml-list On Jun 4, 2007, at 11:50 AM, Benedikt Grundmann wrote: > Hi Jonathan, > > 2007/6/4, Jonathan Bryant <jtbryant@valdosta.edu>: >> >> On Jun 4, 2007, at 6:56 AM, Jon Harrop wrote: >> >> > >> > When you apply this map to an array, a new process is forked for >> > each element. >> > As forking is time consuming, you should only apply this to short >> > arrays. >> >> It might be worth your while to implement a process pool. I said >> before that I fork of one base process from which to fork the others, >> and it's not possible to make that process just a controller for a >> pool of waiting processes. >> >> --Jonathan >> > > Do you also do that in your work for the OSP? I'm asking cause you > mentioned previously that you want to stay compatible to the ocaml > threads library... and I just can not figure out how you want to > implement Thread.create using that base process... I'm staying with the feel of the threads library, but I'm going to have to make it incompatible. My API uses functors to build everything up and then resembles the existing API. I know there has been some talk on this list about performance issues with that, but I prefer a clean design over speed (at least at the moment). Mainly this is because Marshal is not trustworthy, so every message type needs to have to_ and of_ string functions: module type Message = sig type 'a t val to_string : 'a t -> string val of_string : string -> 'a t end From there it builds up. I'm actually trying to design it in such a way that channels aren't limited to UNIX/TCP sockets, and that the on-wire protocol can be changed as well. Hopefully that will go as planned. As for threads, there are actually 3 levels: concurrent (the existing threads), parallel (forked processes), and remote (forked process on another machine). The level of concurrency is specified at thread creation either by using a special version of Thread.create (such as Thread.concurrent or Thread.parallel) or by an optional parameter to Thread.create. This allows the programmer to use the mechanism needed (threads/processes) in the same manner. One additional change is that threads can either be attached (return a value), or detached (like they are now and return unit), allowing for a future-esque usage of threads. Also, since it's internally just HOFs filled in by the runtime, it allows programs written to run on a cluster (using remote threads) to run in a single process for testing (using just concurrent threads), and then moved to a multi-core / multi-machine environment without any recoding or recompilation. > BTW: Maybe we should subscribe each others project's mailing list? So > tha twe can help and learn from each other? > I'm already on yours :). Mine's http://groups.google.com/group/osp2007-northpole From what you've said, I think we might be covering a lot of the same ground. > > Here is the link to mine: > > http://groups.google.com/group/osp2007-econcurrency > > Cheers, > > Bene > > > -- > Calvin: I try to make everyone's day a little more > surreal. > > (From Calvin & Hobbes) ^ permalink raw reply [flat|nested] 6+ messages in thread
[parent not found: <9b415f950706040933r22e1560fhc088368ccb8444fa@mail.gmail.com>]
* Re: [Caml-list] Faking concurrency using Unix forks and pipes (ray tracing results) [not found] ` <9b415f950706040933r22e1560fhc088368ccb8444fa@mail.gmail.com> @ 2007-06-04 16:53 ` Jonathan Bryant 2007-06-04 18:00 ` Brian Hurt 0 siblings, 1 reply; 6+ messages in thread From: Jonathan Bryant @ 2007-06-04 16:53 UTC (permalink / raw) To: Benedikt Grundmann; +Cc: caml-list On Jun 4, 2007, at 12:33 PM, Benedikt Grundmann wrote: > > It looks a bit more complex, but that's just to avoid big strings in > case of big messages > (e.g. with the "simple" interface you end up with the "same" content > in memory twice). I think big strings are unavoidable in this case. They can be broken up at the protocol level for sending, but a large data structure is going to be marshaled into a big string. As far as same content in memory twice, that should be the case for pure values even in a regular OCaml program. As for mutable values, they shouldn't be sent over a channel to begin with. Once channels are available though, creating a synchronous mutable cell is only a few lines of code. (Check out Reppy's book/papers). > I'm trying to do something similar, but I'm not 100% sure right now > that it will all work out exactly as I initially thought.. :-) You > can have a look at my current "interface", I already have a prototype > commited... src/mailbox.mli is the "main" interface of the library. > The comments are not totally in sync with the interface right now as > I'm restructuring a lot these days. I need to do a commit. I'm using modules which return a record of functions, and you can compose them to get channels that work over multiple mediums. > Okay I'm still a little bit confused how does your Thread.create looks > like (e.g what is > its signature)? Something like: module Thread = struct type con_type = | Concurrent | Parallel | Remote let create ?(con = Concurrent) f x = (* Create thread *) let concurrent f x = create ~con:Concurrent f x let parallel f x = create ~con:Parallel f x let remote f x = create ~con:Remote f x let create_attached ?(con = Concurrent) f x = (* Create thread and return event *) let concurrent_attached f x = create ~con:Concurrent f x let parallel_attached f x = create ~con:Parallel f x let remote_attached f x = create ~con:Remote f x (* Other thread functions *) end I've implemented a good chunk of channels/events so far, but haven't got to the threads yet, so don't quote me on that exact signature. --Jonathan > > Cheers, > > Bene > > -- > Calvin: I try to make everyone's day a little more > surreal. > > (From Calvin & Hobbes) ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Caml-list] Faking concurrency using Unix forks and pipes (ray tracing results) 2007-06-04 16:53 ` Jonathan Bryant @ 2007-06-04 18:00 ` Brian Hurt 0 siblings, 0 replies; 6+ messages in thread From: Brian Hurt @ 2007-06-04 18:00 UTC (permalink / raw) To: Jonathan Bryant; +Cc: Benedikt Grundmann, caml-list Jonathan Bryant wrote: > > On Jun 4, 2007, at 12:33 PM, Benedikt Grundmann wrote: > >> >> It looks a bit more complex, but that's just to avoid big strings in >> case of big messages >> (e.g. with the "simple" interface you end up with the "same" content >> in memory twice). > > > I think big strings are unavoidable in this case. They can be broken > up at the protocol level for sending, but a large data structure is > going to be marshaled into a big string. As far as same content in > memory twice, that should be the case for pure values even in a > regular OCaml program. As for mutable values, they shouldn't be sent > over a channel to begin with. Once channels are available though, > creating a synchronous mutable cell is only a few lines of code. > (Check out Reppy's book/papers). I'm wondering if inversion of control isn't an answer here. The idea is to have a function of type buf_t -> string -> unit. Instead of building up a giant string, the of_* functions would call this function on small strings. Not unlike Buffers. But instead of building up in memory, these function would fill a buffer and then send it off. Brian ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2007-06-04 18:01 UTC | newest] Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2007-06-04 10:56 Faking concurrency using Unix forks and pipes (ray tracing results) Jon Harrop 2007-06-04 15:33 ` [Caml-list] " Jonathan Bryant 2007-06-04 15:39 ` Jonathan Bryant [not found] ` <9b415f950706040850v586a285ax1448d23c0c78a375@mail.gmail.com> 2007-06-04 16:13 ` Jonathan Bryant [not found] ` <9b415f950706040933r22e1560fhc088368ccb8444fa@mail.gmail.com> 2007-06-04 16:53 ` Jonathan Bryant 2007-06-04 18:00 ` Brian Hurt
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox