Re: [patch] lockdep: annotate mm/slab.c
From: Arjan van de Ven
Date: Thu Jul 13 2006 - 15:19:37 EST
On Thu, 2006-07-13 at 11:56 -0700, Linus Torvalds wrote:
>
> On Thu, 13 Jul 2006, Pekka Enberg wrote:
> >
> > What's "nested lock" btw? If I understood from the other patch, you're
> > talking about ac->lock. Surely you can't take the same lock twice but
> > it's perfectly legal to take lock as long as the ac instance is
> > different...
>
> Normally, no. You can't take another lock just because the instance is
> different. That causes ABBA deadlocks unless you have some underlying
> _ordering_ of the different lock instances.
and that is the case here fwiw;
the OFF_SLAB metadata thing is tricky when freeing stuff:
for kfree() you take the lock for your own slab, do some management and
then do effectively do a kfree() on your off-slab metadata. Which then
takes the lock for the kmalloc slab of the size of your metadata.
This is a natural order, you assume (and pray) that this kmalloc slab is
not in itself a slab with off-slab metadata, so that there can't be a BA
of the ABBA deadlock. [*]
This is what needed the lockdep annotation; lockdep by default thinks
all slabs are equal (by virtue of sharing the spin_lock_init()
location). And because of the slab-internal knowledge of off-slab versus
not-off-slab they're not equal.
Now there are two choices for annotation: tell the spin lock aquisition
that there is a natural hierarchy (eg spin_lock_depends) and this is
what Ingo tried at first. Except that it got horribly complex, messy and
just outright gross.
The second option is telling lockdep that kmalloc slabs without off-slab
metadata are really different in terms of locking rules (eg they have a
separate identity than the other slabs); and that is why my patch does
(or at least tries to; it works for my machines but the slab
initialization code is horribly complex wrt hotplug and numa so I may
have missed a corner case initialization).
The downside of this 'separate identity' is that you basically split the
lock history, meaning that the graph of past history is not built up as
fast as it could have been. Slab is used so much that that isn't a big
deal; for other more exotic cases that's more of a quality issue.
Greetings,
Arjan van de Ven
[*] Note Note Note
there is a corner case in the slab code that I personally don't trust at
all. In the NUMA case, if the memory is not originally from your own
node, the cache_free_alien() function takes, while having your own local
lock, the lock of the remote node as well. (at least on my reading of
the code) to free the memory to that node. I have yet to see where in
the code it safeguards against that remote node doing the exact same
thing in the opposite direction concurrently, and causing a basic ABBA
deadlock.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/