From: Chris Hecker <checker@d6.com>
To: Damien Doligez <damien.doligez@inria.fr>, caml-list@inria.fr
Subject: Re: [Caml-list] Why systhreads?
Date: Wed, 27 Nov 2002 10:04:55 -0800
Message-ID: <4.3.2.7.2.20021127090821.032eae90@localhost>
In-Reply-To: <E4BA1D26-0209-11D7-8024-0003930FCE12@inria.fr>
[sorry for the long-winded response]
>Do you really think so ? In my experience, 95% of the costs of threads
>(with shared memory) are in the debugging (of the threads implementation,
>AND of the programs). Cheap SMP machines and HT do not change the
>cost/benefit equation very much.
Like I said in my previous mail, I think it's going to be similar to
MMX/SSE. The performance improvement you get is not worth the development
and support headache, until the technology is ubiquitous. Once it's
everywhere, it becomes worthwhile. I'm using a middleware library for my
game right now that requires MMX. That's finally an acceptable
requirement. On Xbox, which is a fixed platform with a known CPU, every
game uses SSE, because it's guaranteed to be there and can make a big
difference if you're willing to work with its problems (using a
structure-of-arrays layout, etc.). And let's not even talk about the insanity of the
PS2 architecture. Xbox2 will use a CPU with HT, because there won't be any
Intel CPUs that don't have HT, so it'll get used there by apps.
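As an illustration of the structure-of-arrays layout mentioned above, here is a minimal sketch in C. The names `vec3`, `vec3_soa`, `aos_to_soa`, `dot_soa`, and the batch size `N_POINTS` are hypothetical and do not come from any library named in this message. The point is only that each component lives in its own contiguous array, so a loop over points touches four adjacent floats at a time, which is the shape a 4-wide SIMD unit wants.

```c
#include <stddef.h>

#define N_POINTS 4  /* hypothetical batch size, matching a 4-wide float register */

/* Array-of-structures element: x, y, z interleaved in memory. */
struct vec3 { float x, y, z; };

/* Structure of arrays: all x together, all y together, all z together. */
struct vec3_soa {
    float x[N_POINTS];
    float y[N_POINTS];
    float z[N_POINTS];
};

/* Repack N_POINTS interleaved points into the split layout. */
void aos_to_soa(const struct vec3 *in, struct vec3_soa *out) {
    for (size_t i = 0; i < N_POINTS; i++) {
        out->x[i] = in[i].x;
        out->y[i] = in[i].y;
        out->z[i] = in[i].z;
    }
}

/* N_POINTS dot products at once; every operand array is contiguous,
   so a vectorizing compiler (or hand-written SSE) can process the
   whole batch per instruction. */
void dot_soa(const struct vec3_soa *a, const struct vec3_soa *b, float *out) {
    for (size_t i = 0; i < N_POINTS; i++) {
        out[i] = a->x[i] * b->x[i] + a->y[i] * b->y[i] + a->z[i] * b->z[i];
    }
}
```

The "problem" with this layout is the one the repacking function hints at: code that naturally thinks in terms of one point at a time has to be restructured around batches.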
Now, as you point out, threads are complicated to design, program, and
debug. I agree with this completely. As I said, I never use threaded
designs if I can avoid it. However, if it becomes very easy to spawn very
small scale parallel threads in C on an HT processor, then it could make a
big performance difference for some algorithms. People are working on C
compilers that have these extensions built in. Intel's got one
now. They'll be first, everyone will ignore it until the installed base is
big enough, and then it'll go into MSVC. MMX, SSE, and 3DNow! followed the
exact same path.
The reason this is different (or has the potential to be different) with HT
compared to discrete CPUs is that a) HT is free so it will be ubiquitous
eventually, and b) HT drops the thread context switch time to 0. It's not
worth starting up a thread on another CPU to do a few instructions' worth of
work, but it is conceivable that it would be for HT. Again, I think this
will mirror MMX. The original version of MMX had a horrible context switch
time, and overloaded the FPU registers. It was worthless. They fixed
it. I assume there are similar gotchas with the first version of HT. But,
in a couple revs, they'll fix it and it will be possible to have a second
thread do half the work in a small loop, with no overhead (there'll be a hw
thread pool, hw wait on mutex/sleep, etc.).
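To make "a second thread does half the work in a small loop" concrete, here is a minimal sketch using POSIX threads. The names `half`, `sum_range`, and `sum_two_threads` are hypothetical. Note the assumption: this uses ordinary `pthread_create`, whose startup cost is far larger than a short loop, so it shows the shape of the idea rather than the cheap hardware-assisted spawn speculated about above.

```c
#include <pthread.h>
#include <stddef.h>

/* One contiguous slice of the input plus a slot for its partial result. */
struct half {
    const int *data;
    size_t begin, end;
    long sum;
};

/* Thread body: sum data[begin..end) into h->sum. */
static void *sum_range(void *arg) {
    struct half *h = arg;
    long s = 0;
    for (size_t i = h->begin; i < h->end; i++) {
        s += h->data[i];
    }
    h->sum = s;
    return NULL;
}

/* Sum data[0..n): a helper thread takes the upper half while the
   caller takes the lower half. If the helper cannot be created,
   the caller simply does both halves itself. */
long sum_two_threads(const int *data, size_t n) {
    struct half lo = { data, 0, n / 2, 0 };
    struct half hi = { data, n / 2, n, 0 };
    pthread_t helper;
    int spawned = pthread_create(&helper, NULL, sum_range, &hi) == 0;

    sum_range(&lo);
    if (spawned) {
        pthread_join(helper, NULL);  /* hi.sum is valid after the join */
    } else {
        sum_range(&hi);
    }
    return lo.sum + hi.sum;
}
```

Each thread writes only its own `sum` field and the join orders the read after the write, so no mutex is needed here. On some toolchains this has to be built with `-pthread`.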
The reason HT can make a performance difference is that your app is
stalling in the CPU all the time anyway. Even tight loops aren't memory
bandwidth bound (unless it's a copy or fill); they're memory access bound,
that is, stuck waiting on the latency of individual loads, and
there's a huge difference between the two. HT can take advantage of the
latter and give you way more utilization, even on a small-scale loop. In
theory, anyway. :) But, as I said, I have [non-Intel] colleagues who have
seen big wins with HT on some applications, enough to make them say, "huh,
this actually works!"
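The bandwidth-bound versus access-bound distinction can be sketched with two loops that compute the same sum. The names `sum_stream` and `sum_chase` are hypothetical, and the claims in the comments describe typical behavior on cached, out-of-order hardware rather than anything measured for this message.

```c
#include <stddef.h>

/* Streaming access: every address is known up front, so the hardware
   can overlap and prefetch loads. On large arrays this kind of loop
   tends to be limited by memory bandwidth. */
long sum_stream(const long *val, size_t n) {
    long s = 0;
    for (size_t i = 0; i < n; i++) {
        s += val[i];
    }
    return s;
}

/* Dependent access: the next index is itself loaded from memory, so
   each cache miss has to complete before the next load can even be
   issued. The core sits idle waiting on latency, which is the idle
   time a second hardware thread could fill. */
long sum_chase(const long *val, const size_t *next, size_t n) {
    long s = 0;
    size_t i = 0;
    for (size_t k = 0; k < n; k++) {
        s += val[i];
        i = next[i];
    }
    return s;
}
```

If `next` describes a single cycle through all `n` indices, both functions return the same total; only the access pattern differs.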
Now, you could just say, "hey, caml's not for that kind of low-level stuff",
which is a fine response. However, I've been doing a lot of low-level stuff
in my game, all in caml (linear algebra, 3D transforms, bitmap operations,
etc.), and it's so close to being good enough to just stay in caml and not
have to drop to C. I understand the point of using the right tool for the
job, but there is overhead (both cognitive and development-process-wise,
both important) associated with hooking something in C, and so it would be
really nice to stay in caml all the time. Bringing this back to HT, this
is the kind of feature that requires INRIA to do it, because I don't think
anybody else understands the gc. By contrast, I could probably get an SSE
code generator working if I thought it was worth it. But there's no way I
could multithread the gc. :)
>More important, you don't need threads and shared memory to make use
>of a SMP machine. Any kind of parallelism will do. Several processes
>with message-passing can easily get you 100% load on all your processors.
>Also, message-passing is more general; for example it will work on clusters.
Sure, but an HT CPU shares L1 and L2 caches between the threads. This
means that you really want your threads to be working on the same data and
code if you can help it. It'll still work for processes, but you're going
to thrash way more than if you're doing local stuff.
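For comparison, here is a minimal sketch of the process-plus-message-passing alternative from the quoted text, using `fork` and a pipe on a POSIX system. The name `sum_two_processes` is hypothetical. The child gets its own view of the address space and its only channel back to the parent is the pipe, which is what makes this style portable to clusters and also what separates it from two threads working on the same data in place.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Sum data[0..n) with a child process handling the upper half and
   sending its partial sum back as a message over a pipe.
   Returns 0 on success and stores the total in *out; returns -1 on
   any failure. */
int sum_two_processes(const int *data, size_t n, long *out) {
    int fd[2];
    if (pipe(fd) != 0) {
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: compute the upper half and write the result. */
        close(fd[0]);
        long s = 0;
        for (size_t i = n / 2; i < n; i++) {
            s += data[i];
        }
        ssize_t w = write(fd[1], &s, sizeof s);
        _exit(w == (ssize_t)sizeof s ? 0 : 1);
    }

    /* Parent: compute the lower half, then receive the message. */
    close(fd[1]);
    long lo = 0, hi = 0;
    for (size_t i = 0; i < n / 2; i++) {
        lo += data[i];
    }
    ssize_t r = read(fd[0], &hi, sizeof hi);
    close(fd[0]);

    int status;
    waitpid(pid, &status, 0);
    if (r != (ssize_t)sizeof hi) {
        return -1;
    }
    *out = lo + hi;
    return 0;
}
```

Both workers can keep a processor busy this way, but the result has to be copied through the kernel rather than read directly from shared memory.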
Again, I'm not an HT zealot; I don't even know if it's going to
succeed. But, I do think it has the potential to have a big impact on
performance-oriented programming, and it would be great if there's a plan
for supporting it in caml if it actually works. If it's simply not
possible to multithread the gc well, then that's that. But it seems like
something you want to have simmering on the mental back burner in case it
turns out you want it later.
Sorry for the huge post,
Chris