Re: [PATCH V2 3/3] block: model freeze & enter queue as lock for supporting lockdep

From: Ming Lei
Date: Wed Oct 30 2024 - 03:15:50 EST


On Wed, Oct 30, 2024 at 02:45:03PM +0800, Lai, Yi wrote:
> Hi Ming,
>
> Greetings!
>
> I used Syzkaller and found a possible deadlock in __submit_bio in linux-next next-20241029.
>
> After bisection, the first bad commit is:
> "
> f1be1788a32e block: model freeze & enter queue as lock for supporting lockdep
> "
>
> All detailed info can be found at:
> https://github.com/laifryiee/syzkaller_logs/tree/main/241029_183511___submit_bio
> Syzkaller repro code:
> https://github.com/laifryiee/syzkaller_logs/tree/main/241029_183511___submit_bio/repro.c
> Syzkaller repro syscall steps:
> https://github.com/laifryiee/syzkaller_logs/tree/main/241029_183511___submit_bio/repro.prog
> Syzkaller report:
> https://github.com/laifryiee/syzkaller_logs/tree/main/241029_183511___submit_bio/repro.report
> Kconfig (make olddefconfig):
> https://github.com/laifryiee/syzkaller_logs/tree/main/241029_183511___submit_bio/kconfig_origin
> Bisect info:
> https://github.com/laifryiee/syzkaller_logs/tree/main/241029_183511___submit_bio/bisect_info.log
> bzImage:
> https://github.com/laifryiee/syzkaller_logs/raw/refs/heads/main/241029_183511___submit_bio/bzImage_6fb2fa9805c501d9ade047fc511961f3273cdcb5
> Issue dmesg:
> https://github.com/laifryiee/syzkaller_logs/blob/main/241029_183511___submit_bio/6fb2fa9805c501d9ade047fc511961f3273cdcb5_dmesg.log
>
> "
> [ 22.219103] 6.12.0-rc5-next-20241029-6fb2fa9805c5 #1 Not tainted
> [ 22.219512] ------------------------------------------------------
> [ 22.219827] repro/735 is trying to acquire lock:
> [ 22.220066] ffff888010f1a768 (&q->q_usage_counter(io)#25){++++}-{0:0}, at: __submit_bio+0x39f/0x550
> [ 22.220568]
> [ 22.220568] but task is already holding lock:
> [ 22.220884] ffffffff872322a0 (fs_reclaim){+.+.}-{0:0}, at: __alloc_pages_slowpath.constprop.0+0x76b/0x21e0
> [ 22.221453]
> [ 22.221453] which lock already depends on the new lock.
> [ 22.221453]
> [ 22.221862]
> [ 22.221862] the existing dependency chain (in reverse order) is:
> [ 22.222247]
> [ 22.222247] -> #1 (fs_reclaim){+.+.}-{0:0}:
> [ 22.222630] lock_acquire+0x80/0xb0
> [ 22.222920] fs_reclaim_acquire+0x116/0x160
> [ 22.223244] __kmalloc_cache_node_noprof+0x59/0x470
> [ 22.223528] blk_mq_init_tags+0x79/0x1a0
> [ 22.223771] blk_mq_alloc_map_and_rqs+0x1f4/0xdd0
> [ 22.224127] blk_mq_init_sched+0x33d/0x6d0
> [ 22.224376] elevator_init_mq+0x2b2/0x400

It should be addressed by the following patch:

https://lore.kernel.org/linux-block/ZyEGLdg744U_xBjp@fedora/

Thanks,
Ming