Re: [PATCH] xfs: avoid inodegc worker flush deadlock
From: Dave Chinner
Date: Mon Mar 30 2026 - 16:55:00 EST
On Mon, Mar 30, 2026 at 10:40:13AM +0800, ZhengYuan Huang wrote:
> On Mon, Mar 30, 2026 at 9:41 AM Dave Chinner <dgc@xxxxxxxxxx> wrote:
> > How did the filesystem get to ENOSPC when freeing an inode?
> > That should not happen, so can you please describe what the system
> > was doing to trip over this issue?
> >
> > i.e. the problem that needs to be understood and fixed here is
> > "freeing an inode should never see ENOSPC", not "inodegc should not
> > recurse"...
>
> Thanks for the reply.
>
> This issue was found by our fuzzing tool, and we are still working on
> a reliable reproducer.
Is this some new custom fuzzer tool, or just another private syzbot
instance?
More importantly: this is not a failure that anyone is likely to see
in production systems, right?
> From the logs we have so far, it appears that the filesystem may
> already be falling back to m_finobt_nores during mount, before the
> later inodegc/ifree path is reached.
Which means ifree would have dipped into the reserve block pool
because when mp->m_finobt_nores is set, we use XFS_TRANS_RESERVE for
the ifree transaction reservation.
> In particular, we observe
> repeated per-AG reservation failures during mount, followed by:
>
> ENOSPC reserving per-AG metadata pool, log recovery may fail.
This error doesn't occur in isolation - what other errors were
reported?
Please post the entire log output from the start of the mount to the
actual reported failure. That way we know the same things as you do,
and can make more informed comments about the error rather than
having to rely on what you think is relevant.
> Based on the current code, my understanding is that when
> xfs_fs_reserve_ag_blocks fails, XFS can continue mounting in the
> degraded m_finobt_nores mode. In this state, xfs_inactive_ifree may
> later take the explicit reservation path, which seems like a plausible
> way for ifree to encounter ENOSPC.
The nores path sets XFS_TRANS_RESERVE, allowing it to dip into the
global reserve blocks pool to avoid ENOSPC in most situations.
However, if it still gets ENOSPC, that means the reserve block pool is
also empty, and whatever corruption the fuzzer introduced has produced
a filesystem with zero space available for the transactions that log
recovery needs to run.
IOWs, if the fs is at ENOSPC, and the reserve pool is also empty,
then we can't run unlinked inode recovery or replay intents because
the transaction reservations will ENOSPC. If that's the case, then
we should be detecting the ENOSPC situation and aborting log
recovery rather than trying to recover and hitting random ENOSPC
failures part way through.
i.e. I'm trying to understand the cause of the ENOSPC issue, because
that will determine how we need to detect whatever on-disk
corruption the fuzzer created to trigger this issue.
-Dave.
--
Dave Chinner
dgc@xxxxxxxxxx