From: Hugo Ferreira <hmf@inescporto.pt>
To: Gerd Stolpmann <gerd@gerd-stolpmann.de>
Cc: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] Shared memory parallel application: kernel threads
Date: Fri, 12 Mar 2010 13:36:13 +0000
Message-ID: <4B9A434D.9010207@inescporto.pt>
In-Reply-To: <1268397268.17070.242.camel@thinkpad>

Hi,

Gerd Stolpmann wrote:
> On Fr, 2010-03-12 at 11:55 +0000, Hugo Ferreira wrote:
>> Hello,
>>
>> I need to implement (meta) heuristic algorithms that
>> use parallelism in order to (attempt to) solve a (hard)
>> machine learning problem that is inherently exponential.
>> The aim is to take maximum advantage of the multi-core
>> processors I have access to.
>>
snip
>> My first concern is to take advantage of the multi-cores so:
>>
>> 1. The thread library is not the answer
>> Chapter 24 - "The threads library is implemented by time-sharing on a
>> single processor. It will not take advantage of multi-processor
>> machines." [1]
>>
>> 2. LinuxThreads seems to be what I need
>> "The main strength of this approach is that it can take full
>> advantage of multiprocessors." [2]
>
> I think you are mixing up several things here. LinuxThreads has nothing
> to do with ocaml. It is an implementation of kernel threads for Linux at
> the C level. It is considered outdated today, and is usually replaced
> by a better implementation (NPTL) that conforms more strictly to the
> POSIX standard.
>
Oops. Silly me.
> For its multi-threading implementation, Ocaml uses the multi-threading
> API the OS provides. This might be LinuxThreads or NPTL or something
> else. So, in the lower half of the implementation, the threads are kernel
> threads and multi-core-enabled.
Ok. I should have read more carefully. As stated in the manual, "Two
implementations of the threads library are available, depending on the
capabilities of the operating system:". So I have a recent glibc and
therefore "multi-core-enabled" threads.
> However, Ocaml prevents more than one of the kernel threads from
> running inside its runtime at any time. So Ocaml code will always run
> on only one core (but you can call C code, and this can then take full
> advantage of multiple cores).
>
Ok. I was under the (wrong) impression that the native OS threads did
run simultaneously (multi-core) but were intermittently stopped due to
the GC. So threads won't help.
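
To make the point concrete, here is a minimal sketch of the Thread.create
API, assuming the system threads library is linked. The job count_primes and
the helper threaded_map are hypothetical names made up for this example.
Under the runtime lock Gerd describes, the threads take turns running OCaml
code, so this is expected to take about as long as the sequential version.

```ocaml
(* Assumes the system threads library is linked, e.g.
   ocamlfind ocamlopt -package threads.posix -linkpkg threads_demo.ml *)

(* Hypothetical CPU-bound job: count the primes below [n] by trial division. *)
let count_primes n =
  let is_prime k =
    let rec go d = d * d > k || (k mod d <> 0 && go (d + 1)) in
    k >= 2 && go 2
  in
  let c = ref 0 in
  for k = 2 to n - 1 do
    if is_prime k then incr c
  done;
  !c

(* Run [f x] for each [x] in its own thread; return the results in order. *)
let threaded_map f xs =
  let slots = Array.make (List.length xs) None in
  let threads =
    List.mapi
      (fun i x -> Thread.create (fun () -> slots.(i) <- Some (f x)) ())
      xs
  in
  (* Wait for every thread before reading the result slots. *)
  List.iter Thread.join threads;
  Array.to_list slots
  |> List.map (function
       | Some r -> r
       | None -> failwith "thread produced no result")

let () =
  threaded_map count_primes [ 20_000; 20_000 ]
  |> List.iter (fun n -> Printf.printf "%d primes%!\n" n)
```

Timing this against List.map count_primes on the same list is a simple way to
check whether the threads actually overlap on a given setup.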
> This is the primary reason I am going with multi-processing in my
> projects, and why Ocamlnet focuses on it.
>
Understood.
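
For reference, a minimal sketch of what multi-processing can look like with
only the Unix and Marshal modules. This is not how Ocamlnet does it; the
names work, spawn, collect and parallel_map are made up for this example.
Each job runs in a forked child, and the child sends its result back through
a pipe.

```ocaml
(* Assumes the unix library is linked, e.g.
   ocamlfind ocamlopt -package unix -linkpkg fork_demo.ml *)

(* Hypothetical job: sum of squares over the half-open range [lo, hi). *)
let work (lo, hi) =
  let s = ref 0 in
  for i = lo to hi - 1 do
    s := !s + (i * i)
  done;
  !s

(* Fork one child that computes [f job] and marshals the result into a pipe.
   The parent gets back the child's pid and the read end of the pipe. *)
let spawn f job =
  let rd, wr = Unix.pipe () in
  match Unix.fork () with
  | 0 ->
      (* Child process: compute, write the result, terminate. *)
      Unix.close rd;
      let oc = Unix.out_channel_of_descr wr in
      Marshal.to_channel oc (f job) [];
      close_out oc;
      exit 0
  | pid ->
      (* Parent process: keep only the read end. *)
      Unix.close wr;
      (pid, Unix.in_channel_of_descr rd)

(* Read one child's result and reap the process. *)
let collect (pid, ic) =
  let r = Marshal.from_channel ic in
  close_in ic;
  ignore (Unix.waitpid [] pid);
  r

(* All children are started first, then the results are read in order. *)
let parallel_map (f : 'a -> 'b) (jobs : 'a list) : 'b list =
  let children = List.map (spawn f) jobs in
  List.map collect children

let () =
  let jobs = [ (0, 1000); (1000, 2000); (2000, 3000); (3000, 4000) ] in
  let total = List.fold_left ( + ) 0 (parallel_map work jobs) in
  Printf.printf "total = %d\n%!" total
```

Marshal.from_channel is not type-safe, so the annotation on parallel_map is
what ties the type of the result to the type of f. Results are copied through
the pipe, which is fine for small values but not for a large data set.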
> The Netcamlbox module of Ocamlnet 3 might be interesting for you. Here
> is an example program that mass-multiplies matrices on several cores:
>
> https://godirepo.camlcity.org/svn/lib-ocamlnet2/trunk/code/examples/camlbox/manymult.ml
>
> Netcamlbox can move complex values to shared memory, so you are not
> restricted to bigarrays. The matrix example uses float array array as
> representation. Recursive variants should also be fine.
>
> For providing shared data to all workers, you can simply load it into
> the master process before the children processes are forked off. Another
> option is (especially when it is a lot of data, and you cannot afford to
> have n copies) to create another camlbox in the master process before
> forking, and to copy the shared data into it before forking. This avoids
> copying the data at fork time.
>
The main data set is large, so I will opt for the latter.
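
For comparison, the first option (load the data in the master, then fork) can
be sketched with the Unix module alone. The camlbox variant is not shown
because its API is not quoted in this thread. The type tree, the array shared
and the helpers below are hypothetical names for this example; the point is
only that a recursive variant built before the fork is readable in every
child without being sent anywhere.

```ocaml
(* Assumes the unix library is linked, e.g.
   ocamlfind ocamlopt -package unix -linkpkg shared_demo.ml *)

(* Hypothetical recursive variant standing in for the real data set. *)
type tree = Leaf of int | Node of tree * tree

let rec build d =
  if d = 0 then Leaf 1 else Node (build (d - 1), build (d - 1))

let rec count = function
  | Leaf n -> n
  | Node (l, r) -> count l + count r

(* Built once in the master, before any child is forked. *)
let shared : tree array = Array.init 8 (fun i -> build (i + 1))

(* Run [f ()] in a forked child; the child reports an int on a pipe. *)
let in_child f =
  let rd, wr = Unix.pipe () in
  match Unix.fork () with
  | 0 ->
      Unix.close rd;
      let oc = Unix.out_channel_of_descr wr in
      output_string oc (string_of_int (f ()) ^ "\n");
      close_out oc;
      exit 0
  | pid ->
      Unix.close wr;
      (pid, Unix.in_channel_of_descr rd)

(* Read the child's answer and reap the process. *)
let wait_result (pid, ic) =
  let r = int_of_string (input_line ic) in
  close_in ic;
  ignore (Unix.waitpid [] pid);
  r

(* Worker [w] handles the entries of [shared] whose index is w mod n_workers.
   Only the small integer result travels back; the trees are never sent. *)
let sums_in_children n_workers =
  List.init n_workers (fun w ->
      in_child (fun () ->
          let acc = ref 0 in
          Array.iteri
            (fun i t -> if i mod n_workers = w then acc := !acc + count t)
            shared;
          !acc))
  |> List.map wait_result

let () =
  let total = List.fold_left ( + ) 0 (sums_in_children 4) in
  Printf.printf "leaves counted by children: %d\n%!" total
```

Whether the children end up with private copies of the data depends on how
much of it gets written to after the fork (by the program or by the GC), which
is the cost Gerd's second option is meant to avoid.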
> One drawback of Netcamlbox is that it is unsafe, and violating the
> programming rules is punished with crashes. (But this also applies, to
> some extent, to multi-threading, only that the rules are different.)
>
Not an issue for me.
Going to read up on and install Ocamlnet 3.
Thanks,
Hugo F.
> Gerd
>
>> Issue 1
>>
>> In the manual [3] I see only references to functions for the creation
>> and use of processes. I see no calls that allow me to simply generate
>> and assign a function (job) to a thread (such as val create : ('a -> 'b)
>> -> 'a -> t in the Thread module). The unix library where LinuxThreads
>> is now integrated shows the same API. Am I missing something or
>> is there no way to launch "threaded functions" from the Unix module?
>> Naturally I assume that threads and processes are not the same thing.
>>
>> Issue 2
>>
>> If I cannot launch kernel-threads to allow for easy memory sharing, what
>> other options do I have besides netshm? The data I must share is defined
>> by a recursive variant and is not simple numerical data.
>>
>> I would appreciate any comments.
>>
>> TIA,
>> Hugo F.
>>
>>
>> [1] http://caml.inria.fr/pub/docs/manual-ocaml/manual038.html
>> [2] http://pauillac.inria.fr/~xleroy/linuxthreads/
>> [3] http://caml.inria.fr/pub/docs/manual-ocaml/libref/ThreadUnix.html
>> [4] http://caml.inria.fr/pub/docs/manual-ocaml/manual035.html