Re: queued spinlock code and results

From: Nick Piggin
Date: Sun Jul 08 2007 - 06:41:15 EST


On Sun, Jul 08, 2007 at 01:18:10PM +0200, Andi Kleen wrote:
> Nick Piggin <npiggin@xxxxxxx> writes:
>
> > I made some tests of the queued spinlock code using userspace test code on
> > 64-bit processors. I believe the xadd based code no longer has any theoretical
> > memory ordering problems.
>
> Linus, the background of this is that on 8 socket Opteron systems
> the current spinlocks can become very unfair to the point of severe
> starvation. These boxes are becomming more common.
>
> > The threaded results also attempt to have an unfairness count, which is the
> > max number of times in a row that a lock is acquired, when all other threads
> > are also executing in the loop -- the reason xadd for example is not always
> > 0 there is because the other threads may not have reached the lock before
> > the current thread was able to get it several times (eg. if an interrupt
> > comes in, this could happen).
>
> Interesting. I was also thinking about switching the lock types
> at boot time. Since all the lock calls are out of line this would
> be reasonably easy.
>
> I would say the main drawback of switchable and queued locks
> would be also that they require a larger spinlock_t thus increasing
> cache usage

Technically the queued locks require twice the size (but I think
the implementation can handle 256 CPUs with 16 bits, while dec based
can only handle 128 with 8 bits -- not a big deal I know, but we'll
probably get there soon).

However currently spinlocks are much bigger than they could be anyway
(4 bytes, could be 1). Although often the alignment of data structures
will make the gain not so big.

But that said, I don't like to justify slightly suboptimal code by
saying that existing code is even less optimal :)

One other upshot of the queued spinlocks is that they don't need
the break_lock field, or any of the associated logic with that
(because it is trivial to test whether a lock is held and also how
many others are spinning on it). So that gives us for free an
avenue into more advanced congestion or spin backoff algorithms.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/