From: Brian Hurt <bhurt@spnz.org>
To: Markus Mottl <markus.mottl@gmail.com>
Cc: Robert Roessler <roessler@rftp.com>, Caml-list <caml-list@inria.fr>
Subject: Re: [Caml-list] Severe loss of performance due to new signal handling
Date: Mon, 20 Mar 2006 22:04:14 -0600 (CST)
Message-ID: <Pine.LNX.4.63.0603202127070.10435@localhost.localdomain>
In-Reply-To: <f8560b80603201911s60ed2882md3e2e2f8f3ca004f@mail.gmail.com>
On Mon, 20 Mar 2006, Markus Mottl wrote:
> On 3/20/06, Robert Roessler <roessler@rftp.com> wrote:
>>
>> At the risk of being "irrelevant", I wanted to nail down exactly what
>> assertion is being made here: are we talking about directly executing
>> in assembly code the relevant x86[-64]/ppc/whatever instructions for
>> "read-and-clear", or going through OS-dependent access routines like
>> Windows' InterlockedExchange()?
>
>
> We are talking of the assembly code. See file byterun/signals_machdep.h,
> which contains the corresponding macros.
OK, poking around a little bit in byterun, I'm seeing this piece of code:
    for (signal_number = 0; signal_number < NSIG; signal_number++) {
        Read_and_clear(signal_state, caml_pending_signals[signal_number]);
        if (signal_state) caml_execute_signal(signal_number, 0);
    }
with Read_and_clear being defined as:
    #if defined(__GNUC__) && defined(__i386__)
    #define Read_and_clear(dst,src) \
        asm("xorl %0, %0; xchgl %0, %1" \
            : "=r" (dst), "=m" (src) \
            : "m" (src))
xchgl is the atomic operation (this is always atomic when referencing a
memory location, regardless of the presence or absence of a lock prefix).
Apropos of nothing, a better definition of that macro would be:
    #define Read_and_clear(dst,src) \
        asm volatile ("xchgl %0, %1" \
                      : "=r" (dst), "+m" (src) \
                      : "0" (0))
as this gives gcc the choice of how to move 0 into the register (using an
xor will still be a popular choice, but it'll occasionally do a movl
depending upon instruction scheduling choices).
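
For comparison, the same read-and-clear step can probably be written without inline assembly. The following is only a sketch, assuming a GCC or Clang new enough to provide the `__atomic` builtins; the function name `read_and_clear` is hypothetical and is not part of byterun. On x86 the compiler would normally emit an xchg for it, and it is free to pick how the zero gets into the register.

```c
#include <assert.h>

/* Hypothetical portable stand-in for the Read_and_clear macro:
   atomically store 0 into *src and return the value that was there.
   __atomic_exchange_n is a GCC/Clang builtin, not an OCaml runtime API. */
static inline long read_and_clear(volatile long *src)
{
    assert(src != 0);  /* the caller must pass a valid flag address */
    return __atomic_exchange_n(src, 0L, __ATOMIC_SEQ_CST);
}
```

A caller would use it as `signal_state = read_and_clear(&flag);`, where `flag` stands for one pending-signal slot.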
Some more poking around tells me that NSIG is defined on Linux to be 64.
I think the problem is not doing an atomic operation, but doing 64 of
them. I'd be inclined to move to a bitset implementation, allowing you
to replace 64 atomic instructions with 2.
On the x86, you can use the lock bts instruction to set the bit. Something
like:
    #if defined(__GNUC__) && defined(__i386__)
    typedef unsigned long sigword_t;
    #define Read_and_clear(dst,src) \
        asm volatile ("xchgl %0, %1" \
                      : "=r" (dst), "+m" (src) \
                      : "0" (0))
    /* btsl gives the memory operand an explicit 32-bit size; the "Ir"
       constraint only allows an immediate bit number in the range 0..31. */
    #define Set_sigflag(sigflags, NR) \
        asm volatile ("lock btsl %1, %0" \
                      : "+m" (*sigflags) \
                      : "Ir" (NR) \
                      : "cc")
...
    /* CHAR_BIT comes from <limits.h> */
    #define SIGWORD_BITS (CHAR_BIT * sizeof(sigword_t))
    #define NR_SIGWORDS ((NSIG + SIGWORD_BITS - 1)/SIGWORD_BITS)
    extern sigword_t caml_pending_signals[NR_SIGWORDS];
    for (i = 0; i < NR_SIGWORDS; i++) {
        sigword_t temp;
        int j;
        Read_and_clear(temp, caml_pending_signals[i]);
        for (j = 0; temp != 0; j++) {
            if ((temp & 1ul) != 0) {
                caml_execute_signal((i * SIGWORD_BITS) + j, 0);
            }
            temp >>= 1;
        }
    }
This is somewhat more code, but i, j, and temp would all end up in
registers, and it'd be two atomic instructions, not 64.
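
To make the bitset idea concrete without per-CPU assembly, here is a minimal sketch that assumes the GCC/Clang `__atomic` builtins are available. The names `SKETCH_NSIG`, `pending`, `set_sigflag`, `drain_pending`, `seen` and `record_signal` are all hypothetical; `record_signal` merely stands in for `caml_execute_signal`, and `SKETCH_NSIG` is fixed at the 64 quoted above rather than taken from `<signal.h>`. It is not the runtime's actual code.

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_NSIG 64               /* assumption: the NSIG value quoted above */
typedef unsigned long sigword_t;
#define SIGWORD_BITS (CHAR_BIT * sizeof(sigword_t))
#define NR_SIGWORDS ((SKETCH_NSIG + SIGWORD_BITS - 1) / SIGWORD_BITS)

static sigword_t pending[NR_SIGWORDS];   /* one bit per signal number */
static int seen[SKETCH_NSIG];            /* how often each signal was run */

/* Stand-in for caml_execute_signal: just count the call. */
static void record_signal(int signo) { seen[signo]++; }

/* Handler side: atomically set the bit for signo (the role of lock bts). */
static void set_sigflag(int signo)
{
    assert(signo >= 0 && signo < SKETCH_NSIG);
    __atomic_fetch_or(&pending[signo / SIGWORD_BITS],
                      (sigword_t)1 << (signo % SIGWORD_BITS),
                      __ATOMIC_SEQ_CST);
}

/* Polling side: one atomic exchange per word, then scan the private copy.
   Calls run(signo) for every pending signal; returns how many were run. */
static int drain_pending(void (*run)(int signo))
{
    int count = 0;
    for (unsigned i = 0; i < NR_SIGWORDS; i++) {
        sigword_t temp = __atomic_exchange_n(&pending[i], (sigword_t)0,
                                             __ATOMIC_SEQ_CST);
        for (unsigned j = 0; temp != 0; j++, temp >>= 1) {
            if (temp & 1ul) {
                run((int)(i * SIGWORD_BITS + j));
                count++;
            }
        }
    }
    return count;
}
```

With a 32-bit `unsigned long` this does two atomic exchanges per poll; where `unsigned long` is 64 bits wide it does one. Setting the same signal twice before a poll collapses into a single pending bit, just as a flag per signal would.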
The x86 assembly code I can dash off from the top of my head. Similar
bits of assembly can be written for other CPUs; I just have to go dig out
the right books.
Brian