From: Hugo Ferreira <hmf@inescporto.pt>
To: Gerd Stolpmann <info@gerd-stolpmann.de>
Cc: Eray Ozkural <examachine@gmail.com>,
Martin Jambon <martin.jambon@ens-lyon.org>,
caml-list@inria.fr
Subject: Re: [Caml-list] Efficient OCaml multicore -- roadmap?
Date: Wed, 20 Apr 2011 08:59:52 +0100
Message-ID: <4DAE9278.4050701@inescporto.pt>
In-Reply-To: <1303244809.8429.1272.camel@thinkpad>
On 04/19/2011 09:26 PM, Gerd Stolpmann wrote:
> On Tuesday, 19.04.2011, at 12:57 +0300, Eray Ozkural wrote:
>>
>>
>> On Fri, Mar 25, 2011 at 9:19 PM, Hugo Ferreira<hmf@inescporto.pt>
>> wrote:
>> On 03/25/2011 06:24 PM, Martin Jambon wrote:
>> On 03/25/2011 01:10 PM, Fabrice Le Fessant
>> wrote:
>> Of course, sharing structured
>> mutable data between threads will not
>> be
>> possible, but actually, it is a good
>> thing if you want to write correct
>> programs ;-)
>>
>> On 03/25/11 08:44, Hugo Ferreira replied:
>> I'll stick to my guns here. It simply makes
>> solving certain problems
>> infeasible. Case in point: I work on machine
>> learning algorithms. I
>> use large data structures that must be
>> processed (altered)
>> in order to learn. Because these
>> data structures are large, it becomes
>> impractical to copy them to a process every
>> time I start off a new
>> "thread".
>>
>> The solution would be to use get/set via a
>> message-passing interface.
>>
>>
>>
>> Cannot see how this works. Say I want to share a balanced
>> binary tree.
>> Several processes/threads each take this tree and alter it by
>> adding and
>> deleting elements. Each (new) tree is then further processed
>> by other
>> processes/threads.
>>
>> How can get/set be used in this scenario?
>>
>>
>>
>>
>> I think it won't have good performance and it won't scale, and it will
>> fail for truly delicate shared memory architectures of the future with
>> thousands of cores....
>>
>>
>> And neither will it support on-chip message passing facilities of
>> those future processors.
>>
>>
>> The shared memory message passing never worked too well, anyway, too
>> many redundant copies. Not fitting for high performance computing.
>>
>>
>> No need at all except for embarrassingly parallel applications. I
>> suppose that's the target, right?
>
> I did experiment a bit in the meantime with Netmulticore [1], my
> implementation of multi-processing with shared memory. Netmulticore
> provides both alterable data structures and a quite efficient message
> passing interface. The experience so far: Message passing wins if you do
> it the right way. Avoid ping-pong games like get/set - shared memory is
> far better for this kind of operation. The better topology for message
> passing is a pipeline where data flows in only one direction.
>
> I don't know how this translates to Hugo's machine learning problem. I
> could imagine a shared data structure is good here for providing the
> starting point for learning. If you run several learning steps in
> parallel, you want to keep these steps from locking each other out, i.e.
> try to ensure that they affect distinct parts of the matrix. These
> updates would be sent (using message passing) to an update manager,
> which would apply the learning results to the matrix, and compute the
> next version, for which a number of new learning steps would be started.
> I'm just guessing here how it could be done. In my imagination a clever
> combination of both (alterable) shared memory and message passing is the
> way to go.
>
Agreed. Currently I use messaging via sexplib to send data sets to
slaves for processing and to return results to a master process the same
way. My objective, however, is to share data structures and return partial
results to a master process. Returning results still requires messaging.
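
As an illustration only, the scheme Gerd outlines above (learning steps that
read a shared matrix, touch distinct parts of it, and send their results as
messages to an update manager that computes the next version) could be
sketched in plain OCaml roughly as follows. This is a minimal sketch under
stated assumptions: the names update, learn_step, apply_updates and run_round
are hypothetical and belong to no library, the "learning" rule is a
placeholder, and the steps run one after another in a single process with a
Queue standing in for a real message channel. It uses neither Netmulticore
nor sexplib.

```ocaml
(* Hypothetical sketch: none of these names come from Netmulticore or any
   other library. Workers are simulated sequentially in one process. *)

(* An update message: "set cell (row, col) to value". *)
type update = { row : int; col : int; value : float }

(* One learning step. It reads the shared matrix but never writes to it;
   it only reports proposed changes for the rows in [lo, hi).
   Placeholder rule: move every cell halfway towards 1.0. *)
let learn_step (m : float array array) ~lo ~hi : update list =
  let updates = ref [] in
  for row = lo to hi - 1 do
    Array.iteri
      (fun col x ->
        updates := { row; col; value = x +. 0.5 *. (1.0 -. x) } :: !updates)
      m.(row)
  done;
  !updates

(* The update manager: drains the mailbox and builds the next version of
   the matrix, leaving the current version untouched. *)
let apply_updates (m : float array array) (mailbox : update Queue.t) =
  let next = Array.map Array.copy m in
  Queue.iter (fun u -> next.(u.row).(u.col) <- u.value) mailbox;
  Queue.clear mailbox;
  next

(* One round: split the rows into disjoint bands (one per worker), run a
   learning step on each band, collect the messages, then apply them. *)
let run_round ~workers m =
  let n = Array.length m in
  let mailbox = Queue.create () in
  for w = 0 to workers - 1 do
    let lo = w * n / workers and hi = (w + 1) * n / workers in
    List.iter (fun u -> Queue.add u mailbox) (learn_step m ~lo ~hi)
  done;
  apply_updates m mailbox

let () =
  let v0 = Array.make_matrix 4 3 0.0 in
  let v1 = run_round ~workers:2 v0 in
  let v2 = run_round ~workers:2 v1 in
  Printf.printf "v0=%.2f v1=%.2f v2=%.2f\n" v0.(0).(0) v1.(0).(0) v2.(0).(0)
```

Because each step only reads the current version and owns a disjoint band of
rows, no step can lock out another; the only writer is the update manager.
In a real multi-process setup the matrix would live in shared memory and the
mailbox would be an actual message-passing channel, which this sketch does
not attempt to model.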
> Hugo, I'd like to do some more experiments in this direction. Is there
> a simple version of a machine learning algorithm I could try to
> parallelize?
>
Unfortunately not. The "system" currently distributes the experiments
across 8 machines with 8 CPUs each, connected in a cluster. The algorithms
themselves, used in each experiment, do _not_ use parallel processing. I
am now working on this (dealing with the issue of noisy data). My last
attempt tried to use the Ancient module but failed because I needed to
alter complex data structures.

Once I get the basic algorithm down, I will try to use parallel
processing again. A lot of work until then 8-(.

Note: a quick look at Netmulticore seems to indicate that I will still
have to jump through some hoops.

Hugo
> Gerd
>
> [1] Netmulticore: now available in ocamlnet-3.3.0test1:
> http://blog.camlcity.org/blog/multicore2.html
>
>>
>>
>> Best,
>>
>> --
>> Eray Ozkural, PhD candidate. Comp. Sci. Dept., Bilkent University,
>> Ankara
>> http://groups.yahoo.com/group/ai-philosophy
>> http://myspace.com/arizanesil http://myspace.com/malfunct
>>
>
>