Re: [syzbot] [netfs?] kernel BUG in iov_iter_revert (2)

From: syzbot
Date: Wed Dec 04 2024 - 09:39:12 EST


Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
possible deadlock in __submit_bio

======================================================
WARNING: possible circular locking dependency detected
6.13.0-rc1-syzkaller-dirty #0 Not tainted
------------------------------------------------------
kswapd0/75 is trying to acquire lock:
ffff888034c41438 (&q->q_usage_counter(io)#37){++++}-{0:0}, at: __submit_bio+0x2c6/0x560 block/blk-core.c:629

but task is already holding lock:
ffffffff8ea35b00 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat mm/vmscan.c:6864 [inline]
ffffffff8ea35b00 (fs_reclaim){+.+.}-{0:0}, at: kswapd+0xbf1/0x36f0 mm/vmscan.c:7246

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (fs_reclaim){+.+.}-{0:0}:
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5849
__fs_reclaim_acquire mm/page_alloc.c:3851 [inline]
fs_reclaim_acquire+0x88/0x130 mm/page_alloc.c:3865
might_alloc include/linux/sched/mm.h:318 [inline]
slab_pre_alloc_hook mm/slub.c:4055 [inline]
slab_alloc_node mm/slub.c:4133 [inline]
__do_kmalloc_node mm/slub.c:4282 [inline]
__kmalloc_node_noprof+0xb2/0x4d0 mm/slub.c:4289
__kvmalloc_node_noprof+0x72/0x190 mm/util.c:650
sbitmap_init_node+0x2d4/0x670 lib/sbitmap.c:132
scsi_realloc_sdev_budget_map+0x2a7/0x460 drivers/scsi/scsi_scan.c:246
scsi_add_lun drivers/scsi/scsi_scan.c:1106 [inline]
scsi_probe_and_add_lun+0x3173/0x4bd0 drivers/scsi/scsi_scan.c:1287
__scsi_add_device+0x228/0x2f0 drivers/scsi/scsi_scan.c:1622
ata_scsi_scan_host+0x236/0x740 drivers/ata/libata-scsi.c:4575
async_run_entry_fn+0xa8/0x420 kernel/async.c:129
process_one_work kernel/workqueue.c:3229 [inline]
process_scheduled_works+0xa66/0x1840 kernel/workqueue.c:3310
worker_thread+0x870/0xd30 kernel/workqueue.c:3391
kthread+0x2f0/0x390 kernel/kthread.c:389
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

-> #0 (&q->q_usage_counter(io)#37){++++}-{0:0}:
check_prev_add kernel/locking/lockdep.c:3161 [inline]
check_prevs_add kernel/locking/lockdep.c:3280 [inline]
validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3904
__lock_acquire+0x1397/0x2100 kernel/locking/lockdep.c:5226
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5849
bio_queue_enter block/blk.h:75 [inline]
blk_mq_submit_bio+0x1536/0x2390 block/blk-mq.c:3091
__submit_bio+0x2c6/0x560 block/blk-core.c:629
__submit_bio_noacct_mq block/blk-core.c:710 [inline]
submit_bio_noacct_nocheck+0x4d3/0xe30 block/blk-core.c:739
swap_writepage_bdev_async mm/page_io.c:451 [inline]
__swap_writepage+0x5fc/0x1400 mm/page_io.c:474
swap_writepage+0x8f4/0x1170 mm/page_io.c:289
pageout mm/vmscan.c:689 [inline]
shrink_folio_list+0x3c0e/0x8cb0 mm/vmscan.c:1367
evict_folios+0x5568/0x7be0 mm/vmscan.c:4593
try_to_shrink_lruvec+0x9a6/0xc70 mm/vmscan.c:4789
shrink_one+0x3b9/0x850 mm/vmscan.c:4834
shrink_many mm/vmscan.c:4897 [inline]
lru_gen_shrink_node mm/vmscan.c:4975 [inline]
shrink_node+0x37c5/0x3e50 mm/vmscan.c:5956
kswapd_shrink_node mm/vmscan.c:6785 [inline]
balance_pgdat mm/vmscan.c:6977 [inline]
kswapd+0x1ca9/0x36f0 mm/vmscan.c:7246
kthread+0x2f0/0x390 kernel/kthread.c:389
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

other info that might help us debug this:

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(fs_reclaim);
                               lock(&q->q_usage_counter(io)#37);
                               lock(fs_reclaim);
  rlock(&q->q_usage_counter(io)#37);

*** DEADLOCK ***

1 lock held by kswapd0/75:
#0: ffffffff8ea35b00 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat mm/vmscan.c:6864 [inline]
#0: ffffffff8ea35b00 (fs_reclaim){+.+.}-{0:0}, at: kswapd+0xbf1/0x36f0 mm/vmscan.c:7246

stack backtrace:
CPU: 0 UID: 0 PID: 75 Comm: kswapd0 Not tainted 6.13.0-rc1-syzkaller-dirty #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
print_circular_bug+0x13a/0x1b0 kernel/locking/lockdep.c:2074
check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2206
check_prev_add kernel/locking/lockdep.c:3161 [inline]
check_prevs_add kernel/locking/lockdep.c:3280 [inline]
validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3904
__lock_acquire+0x1397/0x2100 kernel/locking/lockdep.c:5226
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5849
bio_queue_enter block/blk.h:75 [inline]
blk_mq_submit_bio+0x1536/0x2390 block/blk-mq.c:3091
__submit_bio+0x2c6/0x560 block/blk-core.c:629
__submit_bio_noacct_mq block/blk-core.c:710 [inline]
submit_bio_noacct_nocheck+0x4d3/0xe30 block/blk-core.c:739
swap_writepage_bdev_async mm/page_io.c:451 [inline]
__swap_writepage+0x5fc/0x1400 mm/page_io.c:474
swap_writepage+0x8f4/0x1170 mm/page_io.c:289
pageout mm/vmscan.c:689 [inline]
shrink_folio_list+0x3c0e/0x8cb0 mm/vmscan.c:1367
evict_folios+0x5568/0x7be0 mm/vmscan.c:4593
try_to_shrink_lruvec+0x9a6/0xc70 mm/vmscan.c:4789
shrink_one+0x3b9/0x850 mm/vmscan.c:4834
shrink_many mm/vmscan.c:4897 [inline]
lru_gen_shrink_node mm/vmscan.c:4975 [inline]
shrink_node+0x37c5/0x3e50 mm/vmscan.c:5956
kswapd_shrink_node mm/vmscan.c:6785 [inline]
balance_pgdat mm/vmscan.c:6977 [inline]
kswapd+0x1ca9/0x36f0 mm/vmscan.c:7246
kthread+0x2f0/0x390 kernel/kthread.c:389
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>


Tested on:

commit: 40384c84 Linux 6.13-rc1
git tree: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git v6.13-rc1
console output: https://syzkaller.appspot.com/x/log.txt?x=101560f8580000
kernel config: https://syzkaller.appspot.com/x/.config?x=58639d2215ba9a07
dashboard link: https://syzkaller.appspot.com/bug?extid=404b4b745080b6210c6c
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=138c4de8580000
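
Note on reading the report: the "Possible unsafe locking scenario" above boils down to two lock-order edges that point in opposite directions. The sketch below is only an illustration of that idea, not lockdep's actual algorithm: it records each "B acquired while A held" pair as a graph edge and looks for a cycle. The function name find_cycle, the EDGES list and the shortened lock names are hypothetical and chosen for this example; the two edges are taken from the CPU0/CPU1 columns of the scenario.

```python
from collections import defaultdict


def find_cycle(edges):
    """Return one cycle as a list of lock names, or None if there is none.

    Each edge (held, wanted) means "wanted was acquired while held was held".
    """
    graph = defaultdict(list)
    for held, wanted in edges:
        graph[held].append(wanted)

    visiting = []   # locks on the current DFS path, in order
    done = set()    # locks already fully explored

    def dfs(node):
        if node in visiting:
            # Back edge: the path from the first visit of node closes a cycle.
            return visiting[visiting.index(node):] + [node]
        if node in done:
            return None
        visiting.append(node)
        for nxt in graph[node]:
            cycle = dfs(nxt)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in list(graph):
        cycle = dfs(start)
        if cycle:
            return cycle
    return None


# Edges read off the scenario table (lock names shortened for readability).
EDGES = [
    # CPU1 column: q_usage_counter(io) taken first, then fs_reclaim.
    ("q_usage_counter(io)", "fs_reclaim"),
    # CPU0 column (kswapd): fs_reclaim held, then q_usage_counter(io) wanted.
    ("fs_reclaim", "q_usage_counter(io)"),
]

cycle = find_cycle(EDGES)
print(" -> ".join(cycle) if cycle else "no cycle")
```

With only the first edge recorded the graph is acyclic; the warning corresponds to the moment the second edge would be added.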