Re: SLUB regression in current Linus
From: Linus Torvalds
Date: Tue May 24 2011 - 19:04:03 EST

On Tue, May 24, 2011 at 4:52 AM, James Morris <jmorris@xxxxxxxxx> wrote:
>
> Reverting the patch appears to fix the hang for me, although I'm not sure
> what the actual problem is.
>
> This is on a quad-core Opteron (1352). Let me know if you need any further
> info.

That whole "deactivate_slab()" + "c->page = NULL" that that patch does
looks bogus.

Look at __slab_alloc: we have:

	page = c->page;
	if (!page)
		goto new_slab;

	slab_lock(page);
	if (unlikely(!node_match(c, node)))
		goto another_slab;

and let's assume we have two users racing on that "c->page". The
"slab_lock()" is going to work for one of them, right?

Ok, so the one it works for will then hit

	if (kmem_cache_debug(s))
		goto debug;

and thus get to the new "deactivate_slab(s,c) + c->page = NULL" and
then unlock the page.

In the meantime, the one that wasn't able to lock the page will now go
forward, but will not have "node_match()" any more, so it does that
"goto another_slab".

Which does "deactivate_slab(s,c)" again, and now c->page is NULL, so
that totally breaks.
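
[Editorial sketch, not part of the original message.] The interleaving
described above can be replayed as a small single-threaded C model. This is
only an illustration under stated assumptions: the struct layouts,
toy_deactivate_slab() and replay_interleaving() are hypothetical stand-ins,
not the kernel's definitions, and the model simply follows the email in
sending the second user down the "another_slab" path. Where the real code
would dereference c->page, the toy reports the NULL instead of crashing.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures (assumption). */
struct page {
	bool locked;
	bool frozen;
};

struct kmem_cache_cpu {
	struct page *page;
};

/*
 * Toy deactivate_slab(): returns false if it was handed a cpu slab whose
 * page pointer is already NULL, which is the case the email worries about.
 */
static bool toy_deactivate_slab(struct kmem_cache_cpu *c)
{
	struct page *page = c->page;

	if (page == NULL)
		return false;	/* the real code would dereference NULL here */

	page->frozen = false;
	c->page = NULL;
	return true;
}

/*
 * Replays the sequence from the email in program order and returns how
 * many deactivate calls saw c->page == NULL.
 */
static int replay_interleaving(void)
{
	struct page slab = { .locked = false, .frozen = true };
	struct kmem_cache_cpu c = { .page = &slab };
	int null_deactivations = 0;

	/* Both users read c->page before either one gets any further. */
	struct page *page_a = c.page;
	struct page *page_b = c.page;

	if (page_a == NULL || page_b == NULL)
		return -1;	/* would be "goto new_slab"; not reached here */

	/* User A wins slab_lock() and takes the debug path. */
	page_a->locked = true;
	if (!toy_deactivate_slab(&c))
		null_deactivations++;
	c.page = NULL;		/* the extra assignment added by the patch */
	page_a->locked = false;	/* unlock the page */

	/*
	 * User B now gets the lock and, per the email, no longer passes
	 * node_match(), so it goes to another_slab and deactivates again.
	 */
	page_b->locked = true;
	if (!toy_deactivate_slab(&c))
		null_deactivations++;
	page_b->locked = false;

	return null_deactivations;
}
```

In this model replay_interleaving() returns 1: the second deactivate call
is the one that finds c->page already NULL.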
What am I missing?

That patch seems to be just broken piece-of-s%^!

Christoph, Pekka, please tell me why I shouldn't immediately revert
it. What am I missing?

Linus