Re: possible deadlock in perf_event_detach_bpf_prog

From: Daniel Borkmann
Date: Thu Mar 29 2018 - 17:18:51 EST


On 03/29/2018 11:04 PM, syzbot wrote:
> Hello,
>
> syzbot hit the following crash on upstream commit
> 3eb2ce825ea1ad89d20f7a3b5780df850e4be274 (Sun Mar 25 22:44:30 2018 +0000)
> Linux 4.16-rc7
> syzbot dashboard link: https://syzkaller.appspot.com/bug?extid=dc5ca0e4c9bfafaf2bae
>
> Unfortunately, I don't have any reproducer for this crash yet.
> Raw console output: https://syzkaller.appspot.com/x/log.txt?id=4742532743299072
> Kernel config: https://syzkaller.appspot.com/x/.config?id=-8440362230543204781
> compiler: gcc (GCC) 7.1.1 20170620
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+dc5ca0e4c9bfafaf2bae@xxxxxxxxxxxxxxxxxxxxxxxxx
> It will help syzbot understand when the bug is fixed. See footer for details.
> If you forward the report, please keep this part and the footer.
>
>
> ======================================================
> WARNING: possible circular locking dependency detected
> 4.16.0-rc7+ #3 Not tainted
> ------------------------------------------------------
> syz-executor7/24531 is trying to acquire lock:
>  (bpf_event_mutex){+.+.}, at: [<000000008a849b07>] perf_event_detach_bpf_prog+0x92/0x3d0 kernel/trace/bpf_trace.c:854
>
> but task is already holding lock:
>  (&mm->mmap_sem){++++}, at: [<0000000038768f87>] vm_mmap_pgoff+0x198/0x280 mm/util.c:353
>
> which lock already depends on the new lock.
>
>
> the existing dependency chain (in reverse order) is:
>
> -> #1 (&mm->mmap_sem){++++}:
>        __might_fault+0x13a/0x1d0 mm/memory.c:4571
>        _copy_to_user+0x2c/0xc0 lib/usercopy.c:25
>        copy_to_user include/linux/uaccess.h:155 [inline]
>        bpf_prog_array_copy_info+0xf2/0x1c0 kernel/bpf/core.c:1694
>        perf_event_query_prog_array+0x1c7/0x2c0 kernel/trace/bpf_trace.c:891

Looks like we should move the two copy_to_user() calls outside of the
bpf_event_mutex section to avoid the deadlock.
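
To make the idea concrete, here is a rough userspace sketch of the pattern,
not the actual kernel patch: snapshot the data into a local buffer while
holding the mutex, drop the mutex, and only then copy out to the caller.
Everything in it is an assumption made for illustration: the pthread mutex
stands in for bpf_event_mutex, fake_copy_to_user() stands in for
copy_to_user(), and query_prog_ids(), prog_ids, and MAX_IDS are hypothetical
names that do not exist in the kernel.

```c
/* Userspace illustration only. All names here are hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>

#define MAX_IDS 8

/* Stand-in for bpf_event_mutex. */
static pthread_mutex_t event_mutex = PTHREAD_MUTEX_INITIALIZER;
/* Per-thread flag so the sketch can check the locking rule itself. */
static _Thread_local int event_mutex_held;

/* Data protected by event_mutex. */
static uint32_t prog_ids[MAX_IDS] = { 11, 22, 33 };
static uint32_t prog_cnt = 3;

/*
 * Stand-in for copy_to_user(). The real call may fault and take mmap_sem,
 * so it must not run while event_mutex is held.
 */
static int fake_copy_to_user(void *dst, const void *src, size_t len)
{
	assert(!event_mutex_held);
	memcpy(dst, src, len);
	return 0;
}

/* Returns 0 on success, -1 if a copy to the caller fails. */
int query_prog_ids(uint32_t *user_ids, uint32_t user_len, uint32_t *user_cnt)
{
	uint32_t snap[MAX_IDS];
	uint32_t cnt, n;

	/* Step 1: take a snapshot into a local buffer under the mutex. */
	pthread_mutex_lock(&event_mutex);
	event_mutex_held = 1;
	cnt = prog_cnt;
	n = cnt < user_len ? cnt : user_len;
	memcpy(snap, prog_ids, n * sizeof(snap[0]));
	event_mutex_held = 0;
	pthread_mutex_unlock(&event_mutex);

	/* Step 2: do both copies to the caller after the mutex is dropped. */
	if (fake_copy_to_user(user_cnt, &cnt, sizeof(cnt)))
		return -1;
	if (fake_copy_to_user(user_ids, snap, n * sizeof(snap[0])))
		return -1;
	return 0;
}
```

With this shape the mutex is never held across a call that may take
mmap_sem, so the bpf_event_mutex -> mmap_sem edge of the reported cycle goes
away. The cost is a temporary buffer and one extra copy.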

>        _perf_ioctl kernel/events/core.c:4750 [inline]
>        perf_ioctl+0x3e1/0x1480 kernel/events/core.c:4770
>        vfs_ioctl fs/ioctl.c:46 [inline]
>        do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:686
>        SYSC_ioctl fs/ioctl.c:701 [inline]
>        SyS_ioctl+0x8f/0xc0 fs/ioctl.c:692
>        do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
>        entry_SYSCALL_64_after_hwframe+0x42/0xb7
>
> -> #0 (bpf_event_mutex){+.+.}:
>        lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:3920
>        __mutex_lock_common kernel/locking/mutex.c:756 [inline]
>        __mutex_lock+0x16f/0x1a80 kernel/locking/mutex.c:893
>        mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
>        perf_event_detach_bpf_prog+0x92/0x3d0 kernel/trace/bpf_trace.c:854
>        perf_event_free_bpf_prog kernel/events/core.c:8147 [inline]
>        _free_event+0xbdb/0x10f0 kernel/events/core.c:4116
>        put_event+0x24/0x30 kernel/events/core.c:4204
>        perf_mmap_close+0x60d/0x1010 kernel/events/core.c:5172
>        remove_vma+0xb4/0x1b0 mm/mmap.c:172
>        remove_vma_list mm/mmap.c:2490 [inline]
>        do_munmap+0x82a/0xdf0 mm/mmap.c:2731
>        mmap_region+0x59e/0x15a0 mm/mmap.c:1646
>        do_mmap+0x6c0/0xe00 mm/mmap.c:1483
>        do_mmap_pgoff include/linux/mm.h:2223 [inline]
>        vm_mmap_pgoff+0x1de/0x280 mm/util.c:355
>        SYSC_mmap_pgoff mm/mmap.c:1533 [inline]
>        SyS_mmap_pgoff+0x462/0x5f0 mm/mmap.c:1491
>        SYSC_mmap arch/x86/kernel/sys_x86_64.c:100 [inline]
>        SyS_mmap+0x16/0x20 arch/x86/kernel/sys_x86_64.c:91
>        do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
>        entry_SYSCALL_64_after_hwframe+0x42/0xb7
>
> other info that might help us debug this:
>
>  Possible unsafe locking scenario:
>
>        CPU0                    CPU1
>        ----                    ----
>   lock(&mm->mmap_sem);
>                                lock(bpf_event_mutex);
>                                lock(&mm->mmap_sem);
>   lock(bpf_event_mutex);
>
>  *** DEADLOCK ***
>
> 1 lock held by syz-executor7/24531:
>  #0:  (&mm->mmap_sem){++++}, at: [<0000000038768f87>] vm_mmap_pgoff+0x198/0x280 mm/util.c:353
>
> stack backtrace:
> CPU: 0 PID: 24531 Comm: syz-executor7 Not tainted 4.16.0-rc7+ #3
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:17 [inline]
>  dump_stack+0x194/0x24d lib/dump_stack.c:53
>  print_circular_bug.isra.38+0x2cd/0x2dc kernel/locking/lockdep.c:1223
>  check_prev_add kernel/locking/lockdep.c:1863 [inline]
>  check_prevs_add kernel/locking/lockdep.c:1976 [inline]
>  validate_chain kernel/locking/lockdep.c:2417 [inline]
>  __lock_acquire+0x30a8/0x3e00 kernel/locking/lockdep.c:3431
>  lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:3920
>  __mutex_lock_common kernel/locking/mutex.c:756 [inline]
>  __mutex_lock+0x16f/0x1a80 kernel/locking/mutex.c:893
>  mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
>  perf_event_detach_bpf_prog+0x92/0x3d0 kernel/trace/bpf_trace.c:854
>  perf_event_free_bpf_prog kernel/events/core.c:8147 [inline]
>  _free_event+0xbdb/0x10f0 kernel/events/core.c:4116
>  put_event+0x24/0x30 kernel/events/core.c:4204
>  perf_mmap_close+0x60d/0x1010 kernel/events/core.c:5172
>  remove_vma+0xb4/0x1b0 mm/mmap.c:172
>  remove_vma_list mm/mmap.c:2490 [inline]
>  do_munmap+0x82a/0xdf0 mm/mmap.c:2731
>  mmap_region+0x59e/0x15a0 mm/mmap.c:1646
>  do_mmap+0x6c0/0xe00 mm/mmap.c:1483
>  do_mmap_pgoff include/linux/mm.h:2223 [inline]
>  vm_mmap_pgoff+0x1de/0x280 mm/util.c:355
>  SYSC_mmap_pgoff mm/mmap.c:1533 [inline]
>  SyS_mmap_pgoff+0x462/0x5f0 mm/mmap.c:1491
>  SYSC_mmap arch/x86/kernel/sys_x86_64.c:100 [inline]
>  SyS_mmap+0x16/0x20 arch/x86/kernel/sys_x86_64.c:91
>  do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
>  entry_SYSCALL_64_after_hwframe+0x42/0xb7
> RIP: 0033:0x454889
> RSP: 002b:00007f5f44fdac68 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
> RAX: ffffffffffffffda RBX: 00007f5f44fdb6d4 RCX: 0000000000454889
> RDX: 0000000000000000 RSI: 0000000000002000 RDI: 0000000020f1f000
> RBP: 000000000072c010 R08: 0000000000000014 R09: 0000000000000000
> R10: 0000000000000011 R11: 0000000000000246 R12: 00000000ffffffff
> R13: 00000000000003f4 R14: 00000000006f7f80 R15: 0000000000000002
> bond0 (unregistering): Released all slaves
> IPVS: ftp: loaded support on port[0] = 21
> IPv6: ADDRCONF(NETDEV_UP): bridge0: link is not ready
> IPv6: ADDRCONF(NETDEV_UP): bond0: link is not ready
> 8021q: adding VLAN 0 to HW filter on device bond0
> IPv6: ADDRCONF(NETDEV_UP): veth0: link is not ready
> IPv6: ADDRCONF(NETDEV_CHANGE): veth0: link becomes ready
> kernel msg: ebtables bug: please report to author: Wrong len argument
> kernel msg: ebtables bug: please report to author: Wrong len argument
> kernel msg: ebtables bug: please report to author: Wrong len argument
> kernel msg: ebtables bug: please report to author: Wrong len argument
> kernel msg: ebtables bug: please report to author: Wrong len argument
> kernel msg: ebtables bug: please report to author: Wrong len argument
> kernel msg: ebtables bug: please report to author: Wrong len argument
>
>
> ---
> This bug is generated by a dumb bot. It may contain errors.
> See https://goo.gl/tpsmEJ for details.
> Direct all questions to syzkaller@xxxxxxxxxxxxxxxxx
>
> syzbot will keep track of this bug report.
> If you forgot to add the Reported-by tag, once the fix for this bug is merged
> into any tree, please reply to this email with:
> #syz fix: exact-commit-title
> To mark this as a duplicate of another syzbot report, please reply with:
> #syz dup: exact-subject-of-another-report
> If it's a one-off invalid bug report, please reply with:
> #syz invalid
> Note: if the crash happens again, it will cause creation of a new bug report.
> Note: all commands must start from beginning of the line in the email body.