Re: [PATCH] slab: fix the DEADLOCK issue on l3 alien lock

From: Michael Wang
Date: Mon Sep 10 2012 - 22:50:40 EST


On 09/08/2012 04:39 PM, Pekka Enberg wrote:
> On Fri, Sep 7, 2012 at 1:29 AM, Paul E. McKenney
> <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
>> On Thu, Sep 06, 2012 at 11:05:11AM +0800, Michael Wang wrote:
>>> On 09/05/2012 09:55 PM, Christoph Lameter wrote:
>>>> On Wed, 5 Sep 2012, Michael Wang wrote:
>>>>
>>>>> Since the cachep and cachep->slabp_cache's l3 alien are in the same lock class,
>>>>> fake report generated.
>>>>
>>>> Ahh... That is a key insight into why this occurs.
>>>>
>>>>> This should not happen since we already have init_lock_keys() which will
>>>>> reassign the lock class for both l3 list and l3 alien.
>>>>
>>>> Right. I was wondering why we still get intermitted reports on this.
>>>>
>>>>> This patch will invoke init_lock_keys() after we done enable_cpucache()
>>>>> instead of before to avoid the fake DEADLOCK report.
>>>>
>>>> Acked-by: Christoph Lameter <cl@xxxxxxxxx>
>>>
>>> Thanks for your review.
>>>
>>> And add Paul to the cc list(my skills on mailing is really poor...).
>>
>> Tested-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
>
> I'd also like to tag this for the stable tree to avoid bogus lockdep
> reports. How far back in release history should we queue this?

Hi Pekka,

Sorry for the delayed reply. I tried to work out the reasoning behind commit
30765b92 but have not figured it out yet, so I have added Peter to the cc list.

The patch below, for release 3.0.0, is the one that causes the bogus report.

commit 30765b92ada267c5395fc788623cb15233276f5c
Author: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Date: Thu Jul 28 23:22:56 2011 +0200

slab, lockdep: Annotate the locks before using them

Fernando found we hit the regular OFF_SLAB 'recursion' before we
annotate the locks, cure this.

The relevant portion of the stack-trace:

> [ 0.000000] [<c085e24f>] rt_spin_lock+0x50/0x56
> [ 0.000000] [<c04fb406>] __cache_free+0x43/0xc3
> [ 0.000000] [<c04fb23f>] kmem_cache_free+0x6c/0xdc
> [ 0.000000] [<c04fb2fe>] slab_destroy+0x4f/0x53
> [ 0.000000] [<c04fb396>] free_block+0x94/0xc1
> [ 0.000000] [<c04fc551>] do_tune_cpucache+0x10b/0x2bb
> [ 0.000000] [<c04fc8dc>] enable_cpucache+0x7b/0xa7
> [ 0.000000] [<c0bd9d3c>] kmem_cache_init_late+0x1f/0x61
> [ 0.000000] [<c0bba687>] start_kernel+0x24c/0x363
> [ 0.000000] [<c0bba0ba>] i386_start_kernel+0xa9/0xaf

Reported-by: Fernando Lopez-Lezcano <nando@xxxxxxxxxxxxxxxxxx>
Acked-by: Pekka Enberg <penberg@xxxxxxxxxx>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
Link: http://lkml.kernel.org/r/1311888176.2617.379.camel@laptop
Signed-off-by: Ingo Molnar <mingo@xxxxxxx>

That commit moved the init_lock_keys() call to before the point where we
build up the l3 alien caches, so the alien locks never get reclassed.
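
To illustrate the ordering problem, here is a minimal user-space sketch. It is
only a toy model, not the kernel code: struct toy_cache, the enum values and
the toy_* helpers are hypothetical names made up for this example. It assumes
only what is described above, namely that the reclassing step can only touch
alien locks that already exist when it runs.

```c
#include <assert.h>

/* Hypothetical stand-ins for lockdep classes; not the kernel's types. */
enum toy_lock_class { CLASS_DEFAULT, CLASS_RECLASSED };

struct toy_cache {
	enum toy_lock_class list_lock_class;
	int has_alien;                        /* 0 until the alien caches are built */
	enum toy_lock_class alien_lock_class; /* meaningful only if has_alien != 0 */
};

/* Models init_lock_keys(): reclass only the locks that exist right now. */
void toy_init_lock_keys(struct toy_cache *c)
{
	c->list_lock_class = CLASS_RECLASSED;
	if (c->has_alien)
		c->alien_lock_class = CLASS_RECLASSED;
}

/* Models enable_cpucache(): builds the alien caches; new locks start in the default class. */
void toy_enable_cpucache(struct toy_cache *c)
{
	c->has_alien = 1;
	c->alien_lock_class = CLASS_DEFAULT;
}

/* Ordering after commit 30765b92: annotate first, then build the alien caches. */
enum toy_lock_class alien_class_annotate_first(void)
{
	struct toy_cache c = { CLASS_DEFAULT, 0, CLASS_DEFAULT };

	toy_init_lock_keys(&c);
	toy_enable_cpucache(&c);
	return c.alien_lock_class;
}

/* Ordering proposed by this patch: build the alien caches, then annotate. */
enum toy_lock_class alien_class_enable_first(void)
{
	struct toy_cache c = { CLASS_DEFAULT, 0, CLASS_DEFAULT };

	toy_enable_cpucache(&c);
	toy_init_lock_keys(&c);
	return c.alien_lock_class;
}
```

In this model, annotating first leaves the alien lock in the default class,
which is the shared class that lets lockdep produce the fake report, while
annotating after enable_cpucache() reclasses it as intended.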

Regards,
Michael Wang

