Re: [PATCH V2 3/3] block: model freeze & enter queue as lock for supporting lockdep
From: Lai, Yi
Date: Wed Oct 30 2024 - 04:51:59 EST
On Wed, Oct 30, 2024 at 03:13:09PM +0800, Ming Lei wrote:
> On Wed, Oct 30, 2024 at 02:45:03PM +0800, Lai, Yi wrote:
> > Hi Ming,
> >
> > Greetings!
> >
> > I used Syzkaller and found a possible deadlock in __submit_bio in linux-next next-20241029.
> >
> > After bisection, the first bad commit is:
> > "
> > f1be1788a32e block: model freeze & enter queue as lock for supporting lockdep
> > "
> >
> > All detailed info can be found at:
> > https://github.com/laifryiee/syzkaller_logs/tree/main/241029_183511___submit_bio
> > Syzkaller repro code:
> > https://github.com/laifryiee/syzkaller_logs/tree/main/241029_183511___submit_bio/repro.c
> > Syzkaller repro syscall steps:
> > https://github.com/laifryiee/syzkaller_logs/tree/main/241029_183511___submit_bio/repro.prog
> > Syzkaller report:
> > https://github.com/laifryiee/syzkaller_logs/tree/main/241029_183511___submit_bio/repro.report
> > Kconfig(make olddefconfig):
> > https://github.com/laifryiee/syzkaller_logs/tree/main/241029_183511___submit_bio/kconfig_origin
> > Bisect info:
> > https://github.com/laifryiee/syzkaller_logs/tree/main/241029_183511___submit_bio/bisect_info.log
> > bzImage:
> > https://github.com/laifryiee/syzkaller_logs/raw/refs/heads/main/241029_183511___submit_bio/bzImage_6fb2fa9805c501d9ade047fc511961f3273cdcb5
> > Issue dmesg:
> > https://github.com/laifryiee/syzkaller_logs/blob/main/241029_183511___submit_bio/6fb2fa9805c501d9ade047fc511961f3273cdcb5_dmesg.log
> >
> > "
> > [ 22.219103] 6.12.0-rc5-next-20241029-6fb2fa9805c5 #1 Not tainted
> > [ 22.219512] ------------------------------------------------------
> > [ 22.219827] repro/735 is trying to acquire lock:
> > [ 22.220066] ffff888010f1a768 (&q->q_usage_counter(io)#25){++++}-{0:0}, at: __submit_bio+0x39f/0x550
> > [ 22.220568]
> > [ 22.220568] but task is already holding lock:
> > [ 22.220884] ffffffff872322a0 (fs_reclaim){+.+.}-{0:0}, at: __alloc_pages_slowpath.constprop.0+0x76b/0x21e0
> > [ 22.221453]
> > [ 22.221453] which lock already depends on the new lock.
> > [ 22.221453]
> > [ 22.221862]
> > [ 22.221862] the existing dependency chain (in reverse order) is:
> > [ 22.222247]
> > [ 22.222247] -> #1 (fs_reclaim){+.+.}-{0:0}:
> > [ 22.222630] lock_acquire+0x80/0xb0
> > [ 22.222920] fs_reclaim_acquire+0x116/0x160
> > [ 22.223244] __kmalloc_cache_node_noprof+0x59/0x470
> > [ 22.223528] blk_mq_init_tags+0x79/0x1a0
> > [ 22.223771] blk_mq_alloc_map_and_rqs+0x1f4/0xdd0
> > [ 22.224127] blk_mq_init_sched+0x33d/0x6d0
> > [ 22.224376] elevator_init_mq+0x2b2/0x400
>
> It should be addressed by the following patch:
>
> https://lore.kernel.org/linux-block/ZyEGLdg744U_xBjp@fedora/
>
I have applied the proposed fix patch on top of next-20241029. The issue
can still be reproduced.
It seems the dependency chain in Marek's log differs from the one in mine.
> Thanks,
> Ming
>