Hi there,

Has anybody used OCamlMPI together with multi-threading to overlap computation and communication?

While writing a parallel information-retrieval code, I implemented this by spawning a new thread that runs the communication routines for each block of computation. While that thread proceeds, I start processing the next block, and I yield in the main thread so that some context switching occurs. I think this would pose no problem if each spawned thread made a single blocking call, but that is not the case here. My program performs a rather non-trivial kind of collective communication: a multi-node accumulation with a complex reduction operator, and another complex collective communication with output partitioning. Naturally, each thread has to issue several MPI calls (at least 2*log p at present).

No matter where I yield, my code runs slower than the single-threaded version. I suspect this is partly due to the fact that the OCaml runtime effectively uses a single physical thread, and partly due to stress on local memory. Do you have any recommendations for handling such scenarios? I may well be missing a known solution.
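
For concreteness, the overlap pattern described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual program: `compute`, `communicate` and `run` are hypothetical names, and `communicate` merely simulates a blocking exchange with `Thread.delay` where the real code would issue its sequence of MPI calls. It needs the OCaml threads library to build.

```ocaml
(* Sketch only: all names here are hypothetical stand-ins. *)

(* Placeholder for the local computation on one block. *)
let compute block = Array.fold_left (+) 0 block

(* Placeholder for the communication belonging to one block.
   The real code would make several MPI calls here; Thread.delay
   only simulates a blocking operation that lets other threads run. *)
let communicate results partial =
  Thread.delay 0.01;
  results := partial :: !results

let run blocks =
  let results = ref [] in
  let pending = ref None in
  let wait () =
    match !pending with
    | Some t -> Thread.join t
    | None -> ()
  in
  List.iter
    (fun block ->
      (* Runs while the previous block's communication thread is in flight. *)
      let partial = compute block in
      (* Join the previous communication thread before starting the next,
         so at most one thread is communicating at any time. *)
      wait ();
      pending := Some (Thread.create (communicate results) partial))
    blocks;
  wait ();
  List.rev !results
```

Calling `run [ [|1; 2|]; [|3; 4|]; [|5|] ]` returns the per-block results in order. Note that in this scheme any real overlap depends on the communication thread blocking in a call that lets the main thread run in the meantime; whether the OCamlMPI bindings behave that way is an open assumption here, and it is exactly the part I am unsure about.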

Best Regards,

--
Eray Ozkural, PhD candidate.  Comp. Sci. Dept., Bilkent University, Ankara
http://groups.yahoo.com/group/ai-philosophy