Re: possible deadlock in perf_trace_destroy (2)
From: Steven Rostedt
Date: Mon Jan 22 2018 - 17:20:22 EST
Peter,
Isn't this the same as what you mentioned (and had a hack patch for)
before?
http://lkml.kernel.org/r/20180109133651.GB2369@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
-- Steve
On Mon, 22 Jan 2018 12:57:01 -0800
syzbot
<syzbot+c9695f0404c72c8f0b7abbac660d986af9008fc5@xxxxxxxxxxxxxxxxxxxxxxxxx>
wrote:
> syzbot has found reproducer for the following crash on
> git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/master commit
> ebdd7b491b8a65d65936e07004caabca4a3c94a0 (Sun Jan 21 23:21:31 2018 +0000)
> Merge branch 'mlxsw-Add-support-for-mirror-action-with-flower'
>
> So far this crash happened 22 times on
> git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/master,
> linux-next, mmots, upstream.
> C reproducer is attached.
> syzkaller reproducer is attached.
> Raw console output is attached.
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached.
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by:
> syzbot+c9695f0404c72c8f0b7abbac660d986af9008fc5@xxxxxxxxxxxxxxxxxxxxxxxxx
> It will help syzbot understand when the bug is fixed.
>
>
> ======================================================
> WARNING: possible circular locking dependency detected
> 4.15.0-rc8+ #1 Not tainted
> ------------------------------------------------------
> syzkaller926009/3663 is trying to acquire lock:
> (event_mutex){+.+.}, at: [<00000000df5c386a>]
> perf_trace_destroy+0x28/0x100 kernel/trace/trace_event_perf.c:234
>
> but task is already holding lock:
> (&event->child_mutex){+.+.}, at: [<00000000aedc3ceb>]
> perf_event_release_kernel+0x2ea/0xc10 kernel/events/core.c:4260
>
> which lock already depends on the new lock.
>
>
> the existing dependency chain (in reverse order) is:
>
> -> #5 (&event->child_mutex){+.+.}:
> __mutex_lock_common kernel/locking/mutex.c:756 [inline]
> __mutex_lock+0x16f/0x1a80 kernel/locking/mutex.c:893
> mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
> perf_event_for_each_child+0x8a/0x150 kernel/events/core.c:4557
> _perf_ioctl kernel/events/core.c:4736 [inline]
> perf_ioctl+0x35a/0x1430 kernel/events/core.c:4748
> vfs_ioctl fs/ioctl.c:46 [inline]
> do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:686
> SYSC_ioctl fs/ioctl.c:701 [inline]
> SyS_ioctl+0x8f/0xc0 fs/ioctl.c:692
> entry_SYSCALL_64_fastpath+0x29/0xa0
>
> -> #4 (&cpuctx_mutex){+.+.}:
> __mutex_lock_common kernel/locking/mutex.c:756 [inline]
> __mutex_lock+0x16f/0x1a80 kernel/locking/mutex.c:893
> mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
> perf_event_init_cpu+0xb6/0x160 kernel/events/core.c:11076
> perf_event_init+0x4e9/0x549 kernel/events/core.c:11123
> start_kernel+0x4cc/0x819 init/main.c:627
> x86_64_start_reservations+0x2a/0x2c arch/x86/kernel/head64.c:378
> x86_64_start_kernel+0x77/0x7a arch/x86/kernel/head64.c:359
> secondary_startup_64+0xa5/0xb0 arch/x86/kernel/head_64.S:237
>
> -> #3 (pmus_lock){+.+.}:
> __mutex_lock_common kernel/locking/mutex.c:756 [inline]
> __mutex_lock+0x16f/0x1a80 kernel/locking/mutex.c:893
> mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
> perf_event_init_cpu+0x2f/0x160 kernel/events/core.c:11070
> cpuhp_invoke_callback+0x2ea/0x1d20 kernel/cpu.c:182
> cpuhp_up_callbacks kernel/cpu.c:477 [inline]
> _cpu_up+0x216/0x510 kernel/cpu.c:1036
> do_cpu_up+0x73/0xa0 kernel/cpu.c:1066
> cpu_up+0x18/0x20 kernel/cpu.c:1074
> smp_init+0x13a/0x152 kernel/smp.c:578
> kernel_init_freeable+0x2fe/0x521 init/main.c:1067
> kernel_init+0x13/0x180 init/main.c:999
> ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:541
>
> -> #2 (cpu_hotplug_lock.rw_sem){++++}:
> percpu_down_read_preempt_disable include/linux/percpu-rwsem.h:36
> [inline]
> percpu_down_read include/linux/percpu-rwsem.h:59 [inline]
> cpus_read_lock+0x42/0x90 kernel/cpu.c:293
> static_key_slow_inc+0x9d/0x3c0 kernel/jump_label.c:123
> tracepoint_add_func kernel/tracepoint.c:223 [inline]
> tracepoint_probe_register_prio+0x80d/0x9a0 kernel/tracepoint.c:283
> tracepoint_probe_register+0x2a/0x40 kernel/tracepoint.c:304
> trace_event_reg+0x167/0x320 kernel/trace/trace_events.c:305
> perf_trace_event_reg kernel/trace/trace_event_perf.c:122 [inline]
> perf_trace_event_init kernel/trace/trace_event_perf.c:197 [inline]
> perf_trace_init+0x4ef/0xab0 kernel/trace/trace_event_perf.c:221
> perf_tp_event_init+0x7d/0xf0 kernel/events/core.c:7959
> perf_try_init_event+0xc9/0x1f0 kernel/events/core.c:9192
> perf_init_event kernel/events/core.c:9230 [inline]
> perf_event_alloc+0x1cc6/0x2b00 kernel/events/core.c:9494
> SYSC_perf_event_open+0x84e/0x2e00 kernel/events/core.c:9949
> SyS_perf_event_open+0x39/0x50 kernel/events/core.c:9835
> entry_SYSCALL_64_fastpath+0x29/0xa0
>
> -> #1 (tracepoints_mutex){+.+.}:
> __mutex_lock_common kernel/locking/mutex.c:756 [inline]
> __mutex_lock+0x16f/0x1a80 kernel/locking/mutex.c:893
> mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
> tracepoint_probe_register_prio+0xa0/0x9a0 kernel/tracepoint.c:279
> tracepoint_probe_register+0x2a/0x40 kernel/tracepoint.c:304
> trace_event_reg+0x167/0x320 kernel/trace/trace_events.c:305
> perf_trace_event_reg kernel/trace/trace_event_perf.c:122 [inline]
> perf_trace_event_init kernel/trace/trace_event_perf.c:197 [inline]
> perf_trace_init+0x4ef/0xab0 kernel/trace/trace_event_perf.c:221
> perf_tp_event_init+0x7d/0xf0 kernel/events/core.c:7959
> perf_try_init_event+0xc9/0x1f0 kernel/events/core.c:9192
> perf_init_event kernel/events/core.c:9230 [inline]
> perf_event_alloc+0x1cc6/0x2b00 kernel/events/core.c:9494
> SYSC_perf_event_open+0x84e/0x2e00 kernel/events/core.c:9949
> SyS_perf_event_open+0x39/0x50 kernel/events/core.c:9835
> entry_SYSCALL_64_fastpath+0x29/0xa0
>
> -> #0 (event_mutex){+.+.}:
> lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:3914
> __mutex_lock_common kernel/locking/mutex.c:756 [inline]
> __mutex_lock+0x16f/0x1a80 kernel/locking/mutex.c:893
> mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
> perf_trace_destroy+0x28/0x100 kernel/trace/trace_event_perf.c:234
> tp_perf_event_destroy+0x15/0x20 kernel/events/core.c:7943
> _free_event+0x3bd/0x10f0 kernel/events/core.c:4104
> free_event+0x84/0x150 kernel/events/core.c:4128
> perf_event_release_kernel+0x54e/0xc10 kernel/events/core.c:4272
> perf_release+0x37/0x50 kernel/events/core.c:4298
> __fput+0x327/0x7e0 fs/file_table.c:210
> ____fput+0x15/0x20 fs/file_table.c:244
> task_work_run+0x199/0x270 kernel/task_work.c:113
> exit_task_work include/linux/task_work.h:22 [inline]
> do_exit+0x9bb/0x1ad0 kernel/exit.c:865
> do_group_exit+0x149/0x400 kernel/exit.c:968
> get_signal+0x73f/0x16c0 kernel/signal.c:2335
> do_signal+0x90/0x1eb0 arch/x86/kernel/signal.c:809
> exit_to_usermode_loop+0x214/0x310 arch/x86/entry/common.c:158
> prepare_exit_to_usermode arch/x86/entry/common.c:195 [inline]
> syscall_return_slowpath+0x490/0x550 arch/x86/entry/common.c:264
> entry_SYSCALL_64_fastpath+0x9e/0xa0
>
> other info that might help us debug this:
>
> Chain exists of:
> event_mutex --> &cpuctx_mutex --> &event->child_mutex
>
> Possible unsafe locking scenario:
>
> CPU0 CPU1
> ---- ----
> lock(&event->child_mutex);
> lock(&cpuctx_mutex);
> lock(&event->child_mutex);
> lock(event_mutex);
>
> *** DEADLOCK ***
>
> 2 locks held by syzkaller926009/3663:
> #0: (&ctx->mutex){+.+.}, at: [<000000006557d8ef>]
> perf_event_release_kernel+0x2dc/0xc10 kernel/events/core.c:4259
> #1: (&event->child_mutex){+.+.}, at: [<00000000aedc3ceb>]
> perf_event_release_kernel+0x2ea/0xc10 kernel/events/core.c:4260
>
> stack backtrace:
> CPU: 0 PID: 3663 Comm: syzkaller926009 Not tainted 4.15.0-rc8+ #1
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
> __dump_stack lib/dump_stack.c:17 [inline]
> dump_stack+0x194/0x257 lib/dump_stack.c:53
> print_circular_bug.isra.37+0x2cd/0x2dc kernel/locking/lockdep.c:1218
> check_prev_add kernel/locking/lockdep.c:1858 [inline]
> check_prevs_add kernel/locking/lockdep.c:1971 [inline]
> validate_chain kernel/locking/lockdep.c:2412 [inline]
> __lock_acquire+0x30a8/0x3e00 kernel/locking/lockdep.c:3426
> lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:3914
> __mutex_lock_common kernel/locking/mutex.c:756 [inline]
> __mutex_lock+0x16f/0x1a80 kernel/locking/mutex.c:893
> mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
> perf_trace_destroy+0x28/0x100 kernel/trace/trace_event_perf.c:234
> tp_perf_event_destroy+0x15/0x20 kernel/events/core.c:7943
> _free_event+0x3bd/0x10f0 kernel/events/core.c:4104
> free_event+0x84/0x150 kernel/events/core.c:4128
> perf_event_release_kernel+0x54e/0xc10 kernel/events/core.c:4272
> perf_release+0x37/0x50 kernel/events/core.c:4298
> __fput+0x327/0x7e0 fs/file_table.c:210
> ____fput+0x15/0x20 fs/file_table.c:244
> task_work_run+0x199/0x270 kernel/task_work.c:113
> exit_task_work include/linux/task_work.h:22 [inline]
> do_exit+0x9bb/0x1ad0 kernel/exit.c:865
> do_group_exit+0x149/0x400 kernel/exit.c:968
> get_signal+0x73f/0x16c0 kernel/signal.c:2335
> do_signal+0x90/0x1eb0 arch/x86/kernel/signal.c:809
> exit_to_usermode_loop+0x214/0x310 arch/x86/entry/common.c:158
> prepare_exit_to_usermode arch/x86/entry/common.c:195 [inline]
> syscall_return_slowpath+0x490/0x550 arch/x86/entry/common.c:264
> entry_SYSCALL_64_fastpath+0x9e/0xa0
> RIP: 0033:0x445749
> RSP: 002b:00007febf72a3db8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
> RAX: fffffffffffffe00 RBX: 00000000006dac3c RCX: 0000000000445749
> RDX: 0000000000000000
>