Re: lockdep WARNING in get_online_cpus

From: Dmitry Vyukov
Date: Mon Apr 04 2016 - 04:19:35 EST


On Sat, Apr 2, 2016 at 3:02 PM, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
> Hello,
>
> After syncing to 05cf8077e54b20dddb756eaa26f3aeb5c38dd3cf (Apr 1) I've
> started getting the following warnings while running syzkaller fuzzer:
>
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 7252 at kernel/locking/lockdep.c:2017 [< inline >] check_no_collision kernel/locking/lockdep.c:2017
> WARNING: CPU: 0 PID: 7252 at kernel/locking/lockdep.c:2017 [< inline >] lookup_chain_cache kernel/locking/lockdep.c:2060
> WARNING: CPU: 0 PID: 7252 at kernel/locking/lockdep.c:2017 [< inline >] validate_chain kernel/locking/lockdep.c:2132
> WARNING: CPU: 0 PID: 7252 at kernel/locking/lockdep.c:2017 [< none >] __lock_acquire+0x39a5/0x5ff0 kernel/locking/lockdep.c:3221
> DEBUG_LOCKS_WARN_ON(chain->depth != curr->lockdep_depth - (i - 1))
> Modules linked in:
> CPU: 0 PID: 7252 Comm: syz-executor Tainted: G W 4.6.0-rc1+ #340
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> ffffffff87eb25c0 ffff8800308cf6b8 ffffffff82c8e9df ffffffff81494638
> fffffbfff0fd64b8 ffff8800308cf730 0000000000000000 ffffffff86aa7700
> ffffffff81474095 0000000000000009 ffff8800308cf700 fffffff
> 2016/04/02 12:00:56 executing program 2:
> Call Trace:
> [< inline >] __dump_stack lib/dump_stack.c:15
> [<ffffffff82c8e9df>] dump_stack+0x12e/0x18f lib/dump_stack.c:51
> [<ffffffff8136637f>] __warn+0x19f/0x1e0 kernel/panic.c:512
> [<ffffffff8136646c>] warn_slowpath_fmt+0xac/0xd0 kernel/panic.c:527
> [< inline >] check_no_collision kernel/locking/lockdep.c:2017
> [< inline >] lookup_chain_cache kernel/locking/lockdep.c:2060
> [< inline >] validate_chain kernel/locking/lockdep.c:2132
> [<ffffffff81474095>] __lock_acquire+0x39a5/0x5ff0 kernel/locking/lockdep.c:3221
> [<ffffffff81478ba2>] lock_acquire+0x222/0x4b0 kernel/locking/lockdep.c:3600
> [< inline >] __mutex_lock_common kernel/locking/mutex.c:518
> [<ffffffff867aeb61>] mutex_lock_nested+0xb1/0xa50 kernel/locking/mutex.c:618
> [<ffffffff81367820>] get_online_cpus+0x70/0x90 kernel/cpu.c:174
> [<ffffffff81200ee9>] arch_jump_label_transform+0x69/0x2e0 arch/x86/kernel/jump_label.c:108
> [<ffffffff8166c468>] __jump_label_update+0x168/0x1d0 kernel/jump_label.c:197
> [<ffffffff8166c697>] jump_label_update+0x1c7/0x250 kernel/jump_label.c:482
> [<ffffffff8166cec1>] __static_key_slow_dec+0xc1/0x120 kernel/jump_label.c:85
> [<ffffffff8166cf71>] static_key_slow_dec+0x51/0x90 kernel/jump_label.c:100
> [< inline >] tracepoint_remove_func kernel/tracepoint.c:248
> [<ffffffff815a4eed>] tracepoint_probe_unregister+0x49d/0x5f0 kernel/tracepoint.c:319
> [<ffffffff815e213d>] trace_event_reg+0xed/0x340 kernel/trace/trace_events.c:324
> [<ffffffff815eff31>] perf_trace_event_unreg.isra.1+0xe1/0x1f0 kernel/trace/trace_event_perf.c:152
> [<ffffffff815f0b9e>] perf_trace_destroy+0xbe/0x100 kernel/trace/trace_event_perf.c:233
> [<ffffffff81639e35>] tp_perf_event_destroy+0x15/0x20 kernel/events/core.c:7041
> [<ffffffff81654f2f>] _free_event+0x23f/0xa40 kernel/events/core.c:3750
> [<ffffffff81655762>] free_event+0x32/0xa0 kernel/events/core.c:3776
> [<ffffffff81658cf5>] SYSC_perf_event_open+0x955/0x21d0 kernel/events/core.c:8675
> [<ffffffff81660da9>] SyS_perf_event_open+0x39/0x50 kernel/events/core.c:8324
> [<ffffffff867b9400>] entry_SYSCALL_64_fastpath+0x23/0xc1 arch/x86/entry/entry_64.S:207
> ---[ end trace 23451128bc05ed82 ]---


This happens in CONFIG_DEBUG_LOCKDEP code. Is it a bug in lockdep?
But I always see it at the same stack involving perf and jump_label...
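
To make the condition in the DEBUG_LOCKS_WARN_ON above concrete, here is a
minimal userspace sketch of the depth arithmetic. Everything prefixed with
toy_ is a hypothetical simplification for illustration, not the kernel's
definitions. It assumes that the first-held-lock helper returns the index of
the first held lock sharing the irq context of the lock being acquired, and
that a chain's recorded depth is expected to equal lockdep_depth - (i - 1).

```c
/* Hypothetical, simplified stand-ins for the kernel structures. */
struct toy_held_lock {
	int irq_context;	/* irq context the lock was taken in */
};

struct toy_task {
	int lockdep_depth;	/* number of locks currently held */
	struct toy_held_lock held_locks[8];
};

struct toy_chain {
	int depth;		/* depth recorded when the chain was cached */
};

/*
 * Walk the held locks from the most recent one backwards and stop at the
 * first lock taken in a different irq context. Returns the index of the
 * first lock that belongs to the same context as hlock (assumed to mirror
 * the role of get_first_held_lock()).
 */
static int toy_first_held_lock(const struct toy_task *curr,
			       const struct toy_held_lock *hlock)
{
	int i;

	for (i = curr->lockdep_depth - 1; i >= 0; i--) {
		if (curr->held_locks[i].irq_context != hlock->irq_context)
			break;
	}

	return i + 1;
}

/*
 * Depth part of the consistency check only: 1 if the cached chain depth
 * matches the current held-lock stack plus the lock being acquired,
 * 0 otherwise. The kernel check also compares content, which is omitted.
 */
static int toy_depth_matches(const struct toy_task *curr,
			     const struct toy_held_lock *hlock,
			     const struct toy_chain *chain)
{
	int i = toy_first_held_lock(curr, hlock);

	return chain->depth == curr->lockdep_depth - (i - 1);
}
```

With two locks already held in the same context as the one being acquired,
i is 0, so the expected chain depth is 2 - (0 - 1) = 3. A cached chain with
any other depth under the same chain_key would fail this check, which is the
situation the warning reports.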

/*
 * Checks whether the chain and the current held locks are consistent
 * in depth and also in content. If they are not it most likely means
 * that there was a collision during the calculation of the chain_key.
 * Returns: 0 not passed, 1 passed
 */
static int check_no_collision(struct task_struct *curr,
			struct held_lock *hlock,
			struct lock_chain *chain)
{
#ifdef CONFIG_DEBUG_LOCKDEP
	int i, j, id;

	i = get_first_held_lock(curr, hlock);

	if (DEBUG_LOCKS_WARN_ON(chain->depth != curr->lockdep_depth - (i - 1)))
		return 0;