Re: [syzbot] [net?] KASAN: slab-use-after-free Read in __ethtool_get_link_ksettings

From: syzbot
Date: Sun Oct 13 2024 - 12:08:11 EST


Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
BUG: workqueue leaked atomic, lock or RCU: kworker/NUM:NUM[NUM]

BUG: workqueue leaked atomic, lock or RCU: kworker/1:5[6129]
preempt=0x00000000 lock=0->1 RCU=0->0 workfn=smc_ib_port_event_work
1 lock held by kworker/1:5/6129:
#0: ffffffff8fcd1d48 (rtnl_mutex){+.+.}-{3:3}, at: ib_get_eth_speed+0x13c/0x800 drivers/infiniband/core/verbs.c:1991
CPU: 1 UID: 0 PID: 6129 Comm: kworker/1:5 Not tainted 6.12.0-rc2-syzkaller-00002-g615b94746a54-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Workqueue: events smc_ib_port_event_work
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
process_one_work kernel/workqueue.c:3250 [inline]
process_scheduled_works+0x1158/0x1850 kernel/workqueue.c:3310
worker_thread+0x870/0xd30 kernel/workqueue.c:3391
kthread+0x2f0/0x390 kernel/kthread.c:389
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>

======================================================
WARNING: possible circular locking dependency detected
6.12.0-rc2-syzkaller-00002-g615b94746a54-dirty #0 Not tainted
------------------------------------------------------
kworker/1:5/6129 is trying to acquire lock:
ffff88801ac80948 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3204 [inline]
ffff88801ac80948 ((wq_completion)events){+.+.}-{0:0}, at: process_scheduled_works+0x93b/0x1850 kernel/workqueue.c:3310

but task is already holding lock:
ffffffff8fcd1d48 (rtnl_mutex){+.+.}-{3:3}, at: ib_get_eth_speed+0x13c/0x800 drivers/infiniband/core/verbs.c:1991

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (rtnl_mutex){+.+.}-{3:3}:
reacquire_held_locks+0x3eb/0x690 kernel/locking/lockdep.c:5350
__lock_release kernel/locking/lockdep.c:5539 [inline]
lock_release+0x396/0xa30 kernel/locking/lockdep.c:5846
process_one_work kernel/workqueue.c:3236 [inline]
process_scheduled_works+0xb70/0x1850 kernel/workqueue.c:3310
worker_thread+0x870/0xd30 kernel/workqueue.c:3391
kthread+0x2f0/0x390 kernel/kthread.c:389
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

-> #0 ((wq_completion)events){+.+.}-{0:0}:
check_prev_add kernel/locking/lockdep.c:3161 [inline]
check_prevs_add kernel/locking/lockdep.c:3280 [inline]
validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3904
__lock_acquire+0x1384/0x2050 kernel/locking/lockdep.c:5202
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5825
process_one_work kernel/workqueue.c:3204 [inline]
process_scheduled_works+0x950/0x1850 kernel/workqueue.c:3310
worker_thread+0x870/0xd30 kernel/workqueue.c:3391
kthread+0x2f0/0x390 kernel/kthread.c:389
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

other info that might help us debug this:

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(rtnl_mutex);
                               lock((wq_completion)events);
                               lock(rtnl_mutex);
  lock((wq_completion)events);

*** DEADLOCK ***

1 lock held by kworker/1:5/6129:
#0: ffffffff8fcd1d48 (rtnl_mutex){+.+.}-{3:3}, at: ib_get_eth_speed+0x13c/0x800 drivers/infiniband/core/verbs.c:1991

stack backtrace:
CPU: 1 UID: 0 PID: 6129 Comm: kworker/1:5 Not tainted 6.12.0-rc2-syzkaller-00002-g615b94746a54-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Workqueue: events psi_avgs_work
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
print_circular_bug+0x13a/0x1b0 kernel/locking/lockdep.c:2074
check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2206
check_prev_add kernel/locking/lockdep.c:3161 [inline]
check_prevs_add kernel/locking/lockdep.c:3280 [inline]
validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3904
__lock_acquire+0x1384/0x2050 kernel/locking/lockdep.c:5202
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5825
process_one_work kernel/workqueue.c:3204 [inline]
process_scheduled_works+0x950/0x1850 kernel/workqueue.c:3310
worker_thread+0x870/0xd30 kernel/workqueue.c:3391
kthread+0x2f0/0x390 kernel/kthread.c:389
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>
BUG: workqueue leaked atomic, lock or RCU: kworker/1:5[6129]
preempt=0x00000000 lock=1->0 RCU=0->0 workfn=psi_avgs_work
INFO: lockdep is turned off.
CPU: 1 UID: 0 PID: 6129 Comm: kworker/1:5 Not tainted 6.12.0-rc2-syzkaller-00002-g615b94746a54-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Workqueue: events psi_avgs_work
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
process_one_work kernel/workqueue.c:3250 [inline]
process_scheduled_works+0x1158/0x1850 kernel/workqueue.c:3310
worker_thread+0x870/0xd30 kernel/workqueue.c:3391
kthread+0x2f0/0x390 kernel/kthread.c:389
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>


Tested on:

commit: 615b9474 RDMA/hns: Disassociate mmap pages for all uct..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git for-next
console output: https://syzkaller.appspot.com/x/log.txt?x=131e8727980000
kernel config: https://syzkaller.appspot.com/x/.config?x=7cd9e7e4a8a0a15b
dashboard link: https://syzkaller.appspot.com/bug?extid=5fe14f2ff4ccbace9a26
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=17f7085f980000