Re: [syzbot] [netfs?] kernel BUG in iov_iter_revert (2)
From: David Howells
Date: Wed Dec 04 2024 - 09:43:35 EST
This looks like it's probably a separate bug.
David
syzbot <syzbot+404b4b745080b6210c6c@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> possible deadlock in __submit_bio
>
> ======================================================
> WARNING: possible circular locking dependency detected
> 6.13.0-rc1-syzkaller-dirty #0 Not tainted
> ------------------------------------------------------
> kswapd0/75 is trying to acquire lock:
> ffff888034c41438 (&q->q_usage_counter(io)#37){++++}-{0:0}, at: __submit_bio+0x2c6/0x560 block/blk-core.c:629
>
> but task is already holding lock:
> ffffffff8ea35b00 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat mm/vmscan.c:6864 [inline]
> ffffffff8ea35b00 (fs_reclaim){+.+.}-{0:0}, at: kswapd+0xbf1/0x36f0 mm/vmscan.c:7246
>
> which lock already depends on the new lock.
>
>
> the existing dependency chain (in reverse order) is:
>
> -> #1 (fs_reclaim){+.+.}-{0:0}:
> lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5849
> __fs_reclaim_acquire mm/page_alloc.c:3851 [inline]
> fs_reclaim_acquire+0x88/0x130 mm/page_alloc.c:3865
> might_alloc include/linux/sched/mm.h:318 [inline]
> slab_pre_alloc_hook mm/slub.c:4055 [inline]
> slab_alloc_node mm/slub.c:4133 [inline]
> __do_kmalloc_node mm/slub.c:4282 [inline]
> __kmalloc_node_noprof+0xb2/0x4d0 mm/slub.c:4289
> __kvmalloc_node_noprof+0x72/0x190 mm/util.c:650
> sbitmap_init_node+0x2d4/0x670 lib/sbitmap.c:132
> scsi_realloc_sdev_budget_map+0x2a7/0x460 drivers/scsi/scsi_scan.c:246
> scsi_add_lun drivers/scsi/scsi_scan.c:1106 [inline]
> scsi_probe_and_add_lun+0x3173/0x4bd0 drivers/scsi/scsi_scan.c:1287
> __scsi_add_device+0x228/0x2f0 drivers/scsi/scsi_scan.c:1622
> ata_scsi_scan_host+0x236/0x740 drivers/ata/libata-scsi.c:4575
> async_run_entry_fn+0xa8/0x420 kernel/async.c:129
> process_one_work kernel/workqueue.c:3229 [inline]
> process_scheduled_works+0xa66/0x1840 kernel/workqueue.c:3310
> worker_thread+0x870/0xd30 kernel/workqueue.c:3391
> kthread+0x2f0/0x390 kernel/kthread.c:389
> ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
>
> -> #0 (&q->q_usage_counter(io)#37){++++}-{0:0}:
> check_prev_add kernel/locking/lockdep.c:3161 [inline]
> check_prevs_add kernel/locking/lockdep.c:3280 [inline]
> validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3904
> __lock_acquire+0x1397/0x2100 kernel/locking/lockdep.c:5226
> lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5849
> bio_queue_enter block/blk.h:75 [inline]
> blk_mq_submit_bio+0x1536/0x2390 block/blk-mq.c:3091
> __submit_bio+0x2c6/0x560 block/blk-core.c:629
> __submit_bio_noacct_mq block/blk-core.c:710 [inline]
> submit_bio_noacct_nocheck+0x4d3/0xe30 block/blk-core.c:739
> swap_writepage_bdev_async mm/page_io.c:451 [inline]
> __swap_writepage+0x5fc/0x1400 mm/page_io.c:474
> swap_writepage+0x8f4/0x1170 mm/page_io.c:289
> pageout mm/vmscan.c:689 [inline]
> shrink_folio_list+0x3c0e/0x8cb0 mm/vmscan.c:1367
> evict_folios+0x5568/0x7be0 mm/vmscan.c:4593
> try_to_shrink_lruvec+0x9a6/0xc70 mm/vmscan.c:4789
> shrink_one+0x3b9/0x850 mm/vmscan.c:4834
> shrink_many mm/vmscan.c:4897 [inline]
> lru_gen_shrink_node mm/vmscan.c:4975 [inline]
> shrink_node+0x37c5/0x3e50 mm/vmscan.c:5956
> kswapd_shrink_node mm/vmscan.c:6785 [inline]
> balance_pgdat mm/vmscan.c:6977 [inline]
> kswapd+0x1ca9/0x36f0 mm/vmscan.c:7246
> kthread+0x2f0/0x390 kernel/kthread.c:389
> ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
>
> other info that might help us debug this:
>
> Possible unsafe locking scenario:
>
>        CPU0                    CPU1
>        ----                    ----
>   lock(fs_reclaim);
>                                lock(&q->q_usage_counter(io)#37);
>                                lock(fs_reclaim);
>   rlock(&q->q_usage_counter(io)#37);
>
> *** DEADLOCK ***
>
> 1 lock held by kswapd0/75:
> #0: ffffffff8ea35b00 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat mm/vmscan.c:6864 [inline]
> #0: ffffffff8ea35b00 (fs_reclaim){+.+.}-{0:0}, at: kswapd+0xbf1/0x36f0 mm/vmscan.c:7246
>
> stack backtrace:
> CPU: 0 UID: 0 PID: 75 Comm: kswapd0 Not tainted 6.13.0-rc1-syzkaller-dirty #0
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
> Call Trace:
> <TASK>
> __dump_stack lib/dump_stack.c:94 [inline]
> dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
> print_circular_bug+0x13a/0x1b0 kernel/locking/lockdep.c:2074
> check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2206
> check_prev_add kernel/locking/lockdep.c:3161 [inline]
> check_prevs_add kernel/locking/lockdep.c:3280 [inline]
> validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3904
> __lock_acquire+0x1397/0x2100 kernel/locking/lockdep.c:5226
> lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5849
> bio_queue_enter block/blk.h:75 [inline]
> blk_mq_submit_bio+0x1536/0x2390 block/blk-mq.c:3091
> __submit_bio+0x2c6/0x560 block/blk-core.c:629
> __submit_bio_noacct_mq block/blk-core.c:710 [inline]
> submit_bio_noacct_nocheck+0x4d3/0xe30 block/blk-core.c:739
> swap_writepage_bdev_async mm/page_io.c:451 [inline]
> __swap_writepage+0x5fc/0x1400 mm/page_io.c:474
> swap_writepage+0x8f4/0x1170 mm/page_io.c:289
> pageout mm/vmscan.c:689 [inline]
> shrink_folio_list+0x3c0e/0x8cb0 mm/vmscan.c:1367
> evict_folios+0x5568/0x7be0 mm/vmscan.c:4593
> try_to_shrink_lruvec+0x9a6/0xc70 mm/vmscan.c:4789
> shrink_one+0x3b9/0x850 mm/vmscan.c:4834
> shrink_many mm/vmscan.c:4897 [inline]
> lru_gen_shrink_node mm/vmscan.c:4975 [inline]
> shrink_node+0x37c5/0x3e50 mm/vmscan.c:5956
> kswapd_shrink_node mm/vmscan.c:6785 [inline]
> balance_pgdat mm/vmscan.c:6977 [inline]
> kswapd+0x1ca9/0x36f0 mm/vmscan.c:7246
> kthread+0x2f0/0x390 kernel/kthread.c:389
> ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
> </TASK>