Re: [PATCH v3] loop: Fix NULL pointer dereference in lo_rw_aio()
From: Ming Lei
Date: Tue May 26 2026 - 21:20:27 EST
On Tue, May 26, 2026 at 09:25:30AM +0900, Tetsuo Handa wrote:
> On 2026/05/26 0:19, Ming Lei wrote:
> > On Mon, May 25, 2026 at 12:40:19PM +0900, Tetsuo Handa wrote:
> >> Some commit which was merged in the merge window for 7.1 broke the loop
> >> driver; it introduced a race window where lo_release() clears the backing
> >> file via __loop_clr_fd() while some I/O requests are still pending [1][2].
> >>
> >> The exact commit which changed the behavior is not known due to the lack of
> >> a reproducer and the timing-dependent behavior, but it seems that we need to
> >> solve this problem in the loop driver even though there was no change to the
> >> loop driver during this merge window.
> >>
> >> To close this race, try to flush pending I/O requests. However, calling
> >> drain_workqueue() from __loop_clr_fd() with disk->open_mutex held causes
> >> lockdep warnings [3][4]. We need to flush pending I/O requests without
> >> disk->open_mutex held.
> >
> > No, please don't work around this before finding the root cause.
> >
> > Nothing shows that the issue is in the block layer or the loop driver. The IO
> > isn't expected; you need to figure out why btrfs still issues IO after this
> > loop disk is closed by everyone and writeback is done.
> >
> > https://syzkaller.appspot.com/x/log.txt?x=101e4702580000
> >
>
> Of course we should try to figure out the root cause first, but how can we do that?
Unexpected write IO from btrfs (after umount and after the loop device is closed)
is definitely the more serious problem, since it may cause data loss, so I am
CCing the btrfs list and maintainers.
...
> Possible approaches for finding the exact commit that is causing this problem:
>
> (a) Revert all changes in the block layer from linux.git and monitor for one week to see whether
> this problem is still happening (because linux.git is hitting this problem more frequently than
> linux-next.git).
>
> (b) Revert all changes in the block layer from linux-next.git and monitor for two weeks to see
> whether this problem is still happening (less reliable than linux.git but a candidate).
>
> (c) Let sashiko review all changes between v7.0 and v7.1 that may cause this problem.
> (Human developers have no time to review. But is an investigation with a moving baseline
> commit possible for sashiko?)
>
> (d) Any ideas?
>
> P.S. Since the loop driver is critical infrastructure for testing filesystems by syzbot,
> I want this problem to be addressed before 7.1 is released.
syzbot is for finding real problems; here the real trouble is unexpected write IO from btrfs.
So please do not try to paper over a real bug by 'fixing' loop.
Thanks,
Ming