Re: Interpreter vs hardware threads

Mailing list for all users of the OCaml language and system.
 help / color / mirror / Atom feed

From: Xavier Leroy <Xavier.Leroy@inria.fr>
To: William Chesters <williamc@paneris.org>, caml-list@inria.fr
Subject: Re: Interpreter vs hardware threads
Date: Mon, 6 Mar 2000 18:35:46 +0100	[thread overview]
Message-ID: <20000306183546.18171@pauillac.inria.fr> (raw)
In-Reply-To: <200003021818.SAA13646@toy.william.bogus>; from William Chesters on Thu, Mar 02, 2000 at 06:18:32PM +0000

>    IIRC, Linux native pthreads, for one, aren't at the moment meant to
> be used in huge numbers---the flood-test performance of those Linux
> Java VMs which map Java threads onto them supposedly suffers compared
> with those that don't.  But Xavier would be able to tell us more about
> that :).

My pleasure :-)  It is true that threads in LinuxThreads are not as
lightweight as some would like, mostly because they map 1-to-1 to
kernel processes, and the Linux kernel limits the number of processes
to 512 (for the super-user) or 256 (for normal users).  Also, kernel
scheduling takes time linear in the number of processes (just like the
OCaml bytecode-thread scheduler, which was strongly inspired by that
of the Linux kernel), so context switching times increase as more
threads are run.

More generally, there are two very different viewpoints on threads
that call for radically different implementations.

The first viewpoint is that threads are a convenient abstraction to
exploit fully some hardware features, such as multiple processors and
overlapping of I/O and computation.  You can exploit multiple
processors and overlap I/O without threads using e.g. communicating
Unix processes and asynchronous I/O.  But it's a pain to program;
threads make the programming of such applications much easier.

In this viewpoint, you never need huge numbers of threads.  For heavy
computation on a multiprocessor, N + 1 or N + 2 threads, where N is
the number of processors, delivers optimal throughput; you're not
going to run any faster with more threads.  For overlapped I/O on
PC-class hardware, 20 threads or so are plenty: if you overlap more
than 20 I/O requests at the same time, the disks and network cards
won't follow, and throughput will not be increased either.

Both LinuxThreads and to a lesser extent Caml threads were designed
with that kind of applications in mind.  Caml threads can't exploit
multiprocessors because the runtime system isn't safe
w.r.t. concurrent execution, but they have proven quite effective for
overlapped I/O, e.g. in the V6 intelligent Web proxy.

The other viewpoint is that threads are a code structuring facility,
making it easier to write programs that conceptually perform a lot of
things concurrently, e.g. in response to external stimuli.  In this
viewpoint, threads should be as lightweight as possible (starting a
thread shouldn't be much more expensive than calling a function), and
limited in number only by available memory.

The sad fact is that there doesn't exist any implementation technique
for threads that satisfy both viewpoints at once.  Very lightweight
threads do exist, see e.g. the call/cc-based threads of SML/NJ, but
entail significant performance penalties not only on I/O, but also on
the actual running speed of the sequential code.  "Heavyweight"
threads such as LinuxThreads or Win32 threads are very efficient
w.r.t. I/O, but are expensive to start and to context-switch.
Two-level thread libraries are a compromise that is appealing on
paper, but not that much "lighter" than pure kernel-based threads in
reality.

Add to this the fact that the primitives provided by the underlying OS
affect quite a lot thread libraries.  For instance, some Unixes
provide async I/O notification via signals, and that could be used to
speed up the Caml bytecode-thread scheduler, but not all Unixes
provide them.  Also, if we were certain that the underlying OS
provides native threads, we could build a two-level scheduler for
bytecode threads that would probably be more efficient.  But of course
we have to shoot for the lowest common denominator.

So, don't expect miracles from Caml threads, either bytecode or
system.  As Michael Hicks said, the current bytecode thread scheduler
could be improved to run in time proportional to the number of threads
waiting on I/O, rather than on the number of threads.  That would help
if most of your threads are blocked on mutexes, conditions or event
channels, but not if most of your threads perform I/O.

>    (Of course, one could always ask the ocaml team that won the ICFP
> competition in such spectacular fashion to knock off an
> ocaml-to-event-driven-FSM compiler ...)

I'm not sure that OCaml is the best input language for generating
FSMs, but, yes, generating FSMs from a high-level concurrent language
is an excellent approach.  You get both ultimate execution speed and a
readable, maintainable description of the program.  Have a look for
instance at Esterel (another glorious INRIA product :-))

- Xavier Leroy

next prev parent reply	other threads:[~2000-03-06 18:01 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2000-02-25 13:37 mixing Ocaml with another GC-ed language STARYNKEVITCH Basile
2000-02-29 10:24 ` Xavier Leroy
2000-03-01  3:48 ` Nice feature Max Skaller
2000-03-01  3:52 ` Interpreter vs hardware threads Max Skaller
2000-03-01 18:55   ` Michael Hicks
2000-03-01 20:02   ` Stefan Monnier
2000-03-02 18:18     ` William Chesters
2000-03-03 16:59       ` John Max Skaller
2000-03-06 17:35       ` Xavier Leroy [this message]
2000-03-06 23:11         ` John Max Skaller
2000-03-07 13:19         ` Julian Assange
2000-03-08 20:12           ` Xavier Leroy
2000-03-19  6:10             ` Julian Assange
2000-03-08 23:30           ` Max Skaller
2000-03-02 17:25 David Gurr
2000-03-03  9:10 Edwin Young
2000-03-06 13:42 ` William Chesters
2000-03-03 12:12 Juergen Pfitzenmaier
2000-03-07  8:30 Simon Peyton-Jones
2000-03-08 20:02 ` Xavier Leroy
2000-03-12  1:45   ` Julian Assange
2000-03-07  9:06 Simon Peyton-Jones
2000-03-10 18:01 Manuel Fahndrich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20000306183546.18171@pauillac.inria.fr \
    --to=xavier.leroy@inria.fr \
    --cc=caml-list@inria.fr \
    --cc=williamc@paneris.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox