Re: [BUG] long freezes on thinkpad t60

From: Ingo Molnar
Date: Thu Jun 21 2007 - 16:10:29 EST



* Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> If somebody can actually come up with a sequence where we have
> spinlock starvation, and it's not about an example of bad locking, and
> nobody really can come up with any other way to fix it, we may
> eventually have to add the notion of "fair spinlocks".

there was one bad case i can remember: the spinlock debugging code had
an open-coded trylock loop, and on certain Opterons the CPUs were
starving each other. This used to trigger with the ->tree_lock rwlock
i think, on heavy MM loads. The starvation got so bad that the NMI
watchdog started triggering ...

interestingly, this only triggered for certain rwlocks. After a few
failed attempts to pacify this open-coded loop, we currently have that
code disabled in lib/spinlock_debug.c:

#if 0		/* This can cause lockups */
static void __write_lock_debug(rwlock_t *lock)
{
	u64 i;
	u64 loops = loops_per_jiffy * HZ;
	int print_once = 1;

	for (;;) {
		for (i = 0; i < loops; i++) {
			if (__raw_write_trylock(&lock->raw_lock))
				return;
			__delay(1);
		}

the weird thing is that we still have the _very same_ construct in
__spin_lock_debug():

		for (i = 0; i < loops; i++) {
			if (__raw_spin_trylock(&lock->raw_lock))
				return;
			__delay(1);
		}

if there are any problems with this then people are not complaining loud
enough :-)

note that because this is a trylock based loop, the acquire+release
sequence problem should not apply here.
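
for illustration only, the shape of such a bounded trylock polling loop
can be sketched in userspace with C11 atomics. This is not the kernel
code: the sketch_* names are hypothetical, atomic_flag stands in for the
raw lock, and the __delay(1) between attempts is left out.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for a raw lock; not the kernel type. */
struct sketch_lock {
	atomic_flag held;
};

/* One acquisition attempt. Returns true if the lock was taken. */
static bool sketch_trylock(struct sketch_lock *lock)
{
	/* test_and_set returns the previous value: false means it was free. */
	return !atomic_flag_test_and_set_explicit(&lock->held,
						  memory_order_acquire);
}

static void sketch_unlock(struct sketch_lock *lock)
{
	atomic_flag_clear_explicit(&lock->held, memory_order_release);
}

/*
 * Poll with trylock at most 'loops' times. Returns true on success and
 * false when the budget is exhausted, which is the point where a
 * debugging loop could report a suspected lockup.
 */
static bool sketch_lock_bounded(struct sketch_lock *lock, uint64_t loops)
{
	for (uint64_t i = 0; i < loops; i++) {
		if (sketch_trylock(lock))
			return true;
		/* the kernel loops above call __delay(1) here; omitted */
	}
	return false;
}
```

every failed attempt in this sketch is still an atomic read-modify-write
on the lock word, so several CPUs polling like this keep contending for
the same cache line; the sketch shows the loop structure only and says
nothing about fairness between the pollers.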

Ingo