Re: SMP syncronization on AMD processors (broken?)

From: Kirill Korotaev
Date: Thu Oct 13 2005 - 07:29:54 EST


Thanks a lot for the interesting idea provided below.
I will try to implement it.

Kirill

The whole story started when we wrote the following code:

void XXX(void)
{
/* ints disabled */
restart:
spin_lock(&lock);
do_something();
if (!flag)
need_restart = 1;
spin_unlock(&lock);
if (need_restart)
goto restart; <<<< LOOPS 4EVER ON AMD!!!
}

void YYY(void)
{
spin_lock(&lock); <<<< SPINS 4EVER ON AMD!!!
flag = 1;
spin_unlock(&lock);
}

function XXX() starts on CPU0 and begins to loop since flag is not set, then CPU1 calls function YYY() and it turns out that it can't take the lock any arbitrary time.


The right thing to do here is to wait for the flag to be set *outside*
the lock, and then re-validate inside the lock:

void XXX(void)
{
/* ints disabled */
restart:
spin_lock(&lock);
do_something();
if (!flag)
need_restart = 1;
spin_unlock(&lock);
if (need_restart) {
while (!flag)
cpu_relax();
goto restart;
}
}

This way, XXX() keeps the lock dropped for as long as it takes for
YYY() to notice and grab it.


However, I realize that this is of course a simplified case of some real
code, where even *finding* the flag requires the spin lock.

The generic solution is to have a global "progress" counter, which
records "I made progress toward setting flag", that XXX() can
busy-loop on:

int progress;

void XXX(void)
{
int old_progress;
/* ints disabled */
restart:
spin_lock(&lock);
do_something();
if (!flag) {
old_progress = progress;
need_restart = 1;
}
spin_unlock(&lock);
if (need_restart) {
while (progress == old_progress)
cpu_relax();
goto restart;
}
}

void YYY(void)
{
spin_lock(&lock);
flag = 1;
progress++;
spin_unlock(&lock);
}

It may be that in your data structure, there is one or a series of
fields that already exist that you can use for the purpose. The goal
is to merely detect *change*, so you can reacquire the lock and test
definitively. It's okay to read freed memory while doing this, as long as
you can be sure that:
- The memory read won't oops the kernel, and
- You don't end up depending on the value of the freed memory to
get you out of the stall.



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/