Re: [discuss] Re: SMP syncronization on AMD processors (broken?)

From: Andi Kleen
Date: Thu Oct 06 2005 - 09:02:44 EST


On Thursday 06 October 2005 15:46, Andrey Savochkin wrote:
> On Thu, Oct 06, 2005 at 03:32:30PM +0200, Andi Kleen wrote:
> > Kirill Korotaev <dev@xxxxx> writes:
> > > Please help with a non-trivial question about spin_lock/spin_unlock on
> > > SMP archs. The question is whether concurrent spin_lock()'s should
> > > acquire it in a more or less "fair" fashion, or whether one of the CPUs
> > > can starve for an arbitrarily long time while others reacquire it in a loop.
> >
> > They are not fully fair because of the NUMAness of the system.
> > Same on many other NUMA systems.
> >
> > We considered using queued locks long ago to avoid this, but
> > they are quite costly in the uncontended case and never seemed worth it.
> >
> > So live with it.
>
> Well, it's hard to swallow...
> It's not about being not fully fair, it's about deadlocks that started
> to appear after code changes inside retry loops...

Don't do that then.

> A practical question is whether there is an "official" way to tell the CPU
> that it should synchronize with memory, or if you have ideas how to make it
> less costly than queued locks.

I don't think there is a way specified in the architecture. So you're
definitely in undocumented, system-dependent territory if you attempt this.

delay.

Or maybe a write combining access (movnti) followed by an sfence.
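
For illustration only, the movnti + sfence idea could be sketched in
userspace C roughly as below. This is a sketch under assumptions, not a
kernel patch: store_and_drain() is a hypothetical helper name, the
intrinsics are the compiler's SSE2 ones (_mm_stream_si32 emits movnti),
and whether this actually makes the store visible to other CPUs any sooner
is exactly the undocumented, system-dependent part.

```c
#if defined(__SSE2__)
#include <emmintrin.h>  /* _mm_stream_si32 (movnti) and _mm_sfence */
#endif

/*
 * Hypothetical helper: write `value` to `addr` with a non-temporal
 * (write-combining) store, then issue a store fence.
 * Assumption: this only expresses the idea from the mail; nothing in the
 * architecture guarantees it improves fairness between CPUs.
 */
static inline void store_and_drain(int *addr, int value)
{
#if defined(__SSE2__)
    _mm_stream_si32(addr, value);  /* movnti: non-temporal 32-bit store */
    _mm_sfence();                  /* sfence: order it before later stores */
#else
    /* Fallback for non-x86 builds: an ordinary sequentially consistent store. */
    __atomic_store_n(addr, value, __ATOMIC_SEQ_CST);
#endif
}
```

A lock release would then look like store_and_drain(&lock_word, 0), where
lock_word is again only a placeholder name.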


> A theoretical question is how many places in the kernel use such retry
> loops that may start to fail some day (or on some machines)...

We already have such cases - e.g. our rwlocks always had such a deadlock
even on SMP systems. As far as I know it has been reported exactly once on a
64-CPU IA64 system, but it wasn't possible to fix it without large-scale
changes so it was ignored. I am not aware of the problem ever happening on a
production system.

And in general fairness was never a strength of Linux. A lot of subsystems
do resource handling / sharing without taking it into account. And so far
we got away with it.

I'm not saying it's a good thing, but that general strategy
doesn't seem to have hurt us significantly so far and the fixes are usually
worse than the problems.
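
To make the fairness trade-off concrete, here is a minimal sketch of a
ticket lock, a simple FIFO lock. It is not the queued lock that was
considered, and it is not kernel code: the struct and function names are
made up for illustration and it uses C11 atomics. Waiters are served in
the order they took a ticket, so no CPU can starve, but every acquire
pays for an atomic fetch-and-add plus a second counter even when nobody
else wants the lock.

```c
#include <stdatomic.h>

/* Hypothetical FIFO lock: `next` hands out tickets, `serving` admits them. */
struct ticket_lock {
    atomic_uint next;     /* next ticket number to hand out */
    atomic_uint serving;  /* ticket number currently allowed to enter */
};

static void ticket_lock_init(struct ticket_lock *l)
{
    atomic_init(&l->next, 0);
    atomic_init(&l->serving, 0);
}

static void ticket_lock_acquire(struct ticket_lock *l)
{
    /* Take a ticket; arrival order is fixed at this point. */
    unsigned int me = atomic_fetch_add_explicit(&l->next, 1,
                                                memory_order_relaxed);

    /* Spin until our ticket is the one being served. */
    while (atomic_load_explicit(&l->serving, memory_order_acquire) != me)
        ;
}

static void ticket_lock_release(struct ticket_lock *l)
{
    /* Only the holder writes `serving`, so a plain load + store suffices. */
    unsigned int cur = atomic_load_explicit(&l->serving,
                                            memory_order_relaxed);
    atomic_store_explicit(&l->serving, cur + 1, memory_order_release);
}
```

A plain test-and-set spinlock, by contrast, lets whichever CPU wins the
cache line take the lock again, which is where the unfairness discussed
above comes from.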

-Andi