Re: [syzbot] [bluetooth?] BUG: workqueue leaked atomic, lock or RCU: kworker/u9:NUM[NUM]

From: syzbot
Date: Wed Jul 10 2024 - 19:49:03 EST


syzbot has found a reproducer for the following issue on:

HEAD commit: ef445d1539dd Merge remote-tracking branches 'origin/arm64-..
git tree: git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci
console output: https://syzkaller.appspot.com/x/log.txt?x=1544df69980000
kernel config: https://syzkaller.appspot.com/x/.config?x=79a49b0b9ffd6585
dashboard link: https://syzkaller.appspot.com/bug?extid=733a96463546d3026b60
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
userspace arch: arm64
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=13768021980000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=176e2c7e980000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/76e73f830dca/disk-ef445d15.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/d54cbf6f6ee2/vmlinux-ef445d15.xz
kernel image: https://storage.googleapis.com/syzbot-assets/9240e4c79dfc/Image-ef445d15.gz.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+733a96463546d3026b60@xxxxxxxxxxxxxxxxxxxxxxxxx

BUG: workqueue leaked atomic, lock or RCU: kworker/u9:0[52]
preempt=0x00000000 lock=0->1 RCU=0->0 workfn=hci_rx_work
1 lock held by kworker/u9:0/52:
#0: ffff0000d9876518 (&chan->lock/1){+.+.}-{3:3}, at: l2cap_chan_lock include/net/bluetooth/l2cap.h:827 [inline]
#0: ffff0000d9876518 (&chan->lock/1){+.+.}-{3:3}, at: l2cap_conless_channel net/bluetooth/l2cap_core.c:6764 [inline]
#0: ffff0000d9876518 (&chan->lock/1){+.+.}-{3:3}, at: l2cap_recv_frame+0x610/0xc934 net/bluetooth/l2cap_core.c:6830
CPU: 1 PID: 52 Comm: kworker/u9:0 Not tainted 6.10.0-rc7-syzkaller-gef445d1539dd #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
Workqueue: hci0 hci_rx_work
Call trace:
dump_backtrace+0x1b8/0x1e4 arch/arm64/kernel/stacktrace.c:317
show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:324
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xe4/0x150 lib/dump_stack.c:114
dump_stack+0x1c/0x28 lib/dump_stack.c:123
process_one_work+0xdc0/0x15b8 kernel/workqueue.c:3269
process_scheduled_works kernel/workqueue.c:3329 [inline]
worker_thread+0x938/0xecc kernel/workqueue.c:3409
kthread+0x288/0x310 kernel/kthread.c:389
ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:860

======================================================
WARNING: possible circular locking dependency detected
6.10.0-rc7-syzkaller-gef445d1539dd #0 Not tainted
------------------------------------------------------
kworker/u9:0/52 is trying to acquire lock:
ffff0000d385e948 ((wq_completion)hci0#2){+.+.}-{0:0}, at: process_one_work+0x624/0x15b8 kernel/workqueue.c:3222

but task is already holding lock:
ffff0000d9876518 (&chan->lock/1){+.+.}-{3:3}, at: l2cap_chan_lock include/net/bluetooth/l2cap.h:827 [inline]
ffff0000d9876518 (&chan->lock/1){+.+.}-{3:3}, at: l2cap_conless_channel net/bluetooth/l2cap_core.c:6764 [inline]
ffff0000d9876518 (&chan->lock/1){+.+.}-{3:3}, at: l2cap_recv_frame+0x610/0xc934 net/bluetooth/l2cap_core.c:6830

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&chan->lock/1){+.+.}-{3:3}:
__lock_release kernel/locking/lockdep.c:5468 [inline]
lock_release+0x334/0x9b8 kernel/locking/lockdep.c:5774
process_one_work+0x8a0/0x15b8 kernel/workqueue.c:3255
process_scheduled_works kernel/workqueue.c:3329 [inline]
worker_thread+0x938/0xecc kernel/workqueue.c:3409
kthread+0x288/0x310 kernel/kthread.c:389
ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:860

-> #0 ((wq_completion)hci0#2){+.+.}-{0:0}:
check_prev_add kernel/locking/lockdep.c:3134 [inline]
check_prevs_add kernel/locking/lockdep.c:3253 [inline]
validate_chain kernel/locking/lockdep.c:3869 [inline]
__lock_acquire+0x3384/0x763c kernel/locking/lockdep.c:5137
lock_acquire+0x240/0x728 kernel/locking/lockdep.c:5754
process_one_work+0x668/0x15b8 kernel/workqueue.c:3223
process_scheduled_works kernel/workqueue.c:3329 [inline]
worker_thread+0x938/0xecc kernel/workqueue.c:3409
kthread+0x288/0x310 kernel/kthread.c:389
ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:860

other info that might help us debug this:

Possible unsafe locking scenario:

CPU0 CPU1
---- ----
lock(&chan->lock/1);
lock((wq_completion)hci0#2);
lock(&chan->lock/1);
lock((wq_completion)hci0#2);

*** DEADLOCK ***

1 lock held by kworker/u9:0/52:
#0: ffff0000d9876518 (&chan->lock/1){+.+.}-{3:3}, at: l2cap_chan_lock include/net/bluetooth/l2cap.h:827 [inline]
#0: ffff0000d9876518 (&chan->lock/1){+.+.}-{3:3}, at: l2cap_conless_channel net/bluetooth/l2cap_core.c:6764 [inline]
#0: ffff0000d9876518 (&chan->lock/1){+.+.}-{3:3}, at: l2cap_recv_frame+0x610/0xc934 net/bluetooth/l2cap_core.c:6830

stack backtrace:
CPU: 1 PID: 52 Comm: kworker/u9:0 Not tainted 6.10.0-rc7-syzkaller-gef445d1539dd #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
Workqueue: hci0 hci_cmd_timeout
Call trace:
dump_backtrace+0x1b8/0x1e4 arch/arm64/kernel/stacktrace.c:317
show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:324
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xe4/0x150 lib/dump_stack.c:114
dump_stack+0x1c/0x28 lib/dump_stack.c:123
print_circular_bug+0x150/0x1b8 kernel/locking/lockdep.c:2060
check_noncircular+0x310/0x404 kernel/locking/lockdep.c:2187
check_prev_add kernel/locking/lockdep.c:3134 [inline]
check_prevs_add kernel/locking/lockdep.c:3253 [inline]
validate_chain kernel/locking/lockdep.c:3869 [inline]
__lock_acquire+0x3384/0x763c kernel/locking/lockdep.c:5137
lock_acquire+0x240/0x728 kernel/locking/lockdep.c:5754
process_one_work+0x668/0x15b8 kernel/workqueue.c:3223
process_scheduled_works kernel/workqueue.c:3329 [inline]
worker_thread+0x938/0xecc kernel/workqueue.c:3409
kthread+0x288/0x310 kernel/kthread.c:389
ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:860
Bluetooth: hci0: command tx timeout
BUG: workqueue leaked atomic, lock or RCU: kworker/u9:0[52]
preempt=0x00000000 lock=1->0 RCU=0->0 workfn=hci_cmd_timeout
INFO: lockdep is turned off.
CPU: 1 PID: 52 Comm: kworker/u9:0 Not tainted 6.10.0-rc7-syzkaller-gef445d1539dd #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
Workqueue: hci0 hci_cmd_timeout
Call trace:
dump_backtrace+0x1b8/0x1e4 arch/arm64/kernel/stacktrace.c:317
show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:324
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xe4/0x150 lib/dump_stack.c:114
dump_stack+0x1c/0x28 lib/dump_stack.c:123
process_one_work+0xdc0/0x15b8 kernel/workqueue.c:3269
process_scheduled_works kernel/workqueue.c:3329 [inline]
worker_thread+0x938/0xecc kernel/workqueue.c:3409
kthread+0x288/0x310 kernel/kthread.c:389
ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:860
Bluetooth: hci0: command tx timeout
Bluetooth: hci0: command tx timeout
Bluetooth: hci0: command tx timeout


---
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.