Re: [patch 00/10] mutex subsystem, -V5

From: Ingo Molnar
Date: Thu Dec 22 2005 - 16:39:13 EST



* Christoph Lameter <clameter@xxxxxxxxxxxx> wrote:

> Large NUMA systems with lots of lock contention benefit if the one
> trying to lock backs off for awhile. This is first of all a
> modification of the contention path. However, one could also want to
> locally block multiple lock attempts coming from the same node in
> order to reduce contention on the NUMA interlink. See also my HBO
> proposal
> http://marc.theaimsgroup.com/?l=linux-kernel&m=111696945030791&w=2
>
> I would like some more flexible way of dealing with locks in general.
> The code for the MUTEXes seems to lock us into a specific way of
> realizing locks again.

yeah, but we should be careful where to put it: the perfect place for
most of these features and experiments is in the _generic_ code, not in
arch-level code! Look at how we are doing it with spinlocks. It used to
be a nightmare that every arch had to implement preempt support, or
spinlock debugging support, and they ended up not implementing these
features at all, or only doing it partially.

i definitely do not say that _everything_ should be generalized. That
would be micromanaging things. But i definitely think there's an
unhealthy amount of _under_ generalization in current Linux
architectures, and i dont want the mutex subsystem to fall into that
trap.

also, lets face it, even most of the lowlevel details do have some
generic trends in them. And that's not an accident: it's because
everything is controlled by the laws of physics, which tend to be pretty
generic and work the same way, no matter which CPU architecture we are
on. Yes, there can be oddball architectures, but all in one there's a
good reason why we are able to have 90% of the kernel in generic code.

even lowlevel things such as mutexes can be described generally. We need
roughly 3 states to implement them. We need ops to transit from one
state to another, and to figure out whether that transition succeeded.
Now there are CPUs where we can do almost arbitrary math operations
cheaply, and can test for sign and zero. For those CPUs the decision is
easy: we use the best math, which results in the best code. For other
CPUs we might have to use lesser state transitions, with more
side-effects. Certainly an architecture should have a say in determining
which transitions are the best ones, but the generic code wants to be
able to control the _state_ of the mutex.

What if we want to have a fourth state? [i dont think we want one, but
it could happen] What would you prefer - if there were clear rules about
what the arch level ops do, or if an architecture defined its rules in
assembly, basically hardcoding it and making it impossible to change
without changing the assembly? I went for ops that are generic and
describe the math, not the implementation.

and look at the end result: we have a 400 lines of mutex.c (half of
which is comments) that covers 100% of the architectures, in almost the
most optimal way. [with only a few openings left that could reduce an 8
instruction fastpath to a 6 instruction fastpath.] Lets not give that up
and make it once again a hard to change thing.

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/