Re: [PATCH] xfs: use GFP_NOFS in __xfs_trans_alloc

From: Christoph Hellwig

Date: Mon Mar 16 2026 - 05:18:40 EST

On Fri, Mar 13, 2026 at 07:25:05AM +1100, Dave Chinner wrote:
> On Thu, Mar 12, 2026 at 03:22:14PM +0800, Morduan Zang wrote:
> > __xfs_trans_alloc() allocates the transaction structure before
> > xfs_trans_set_context() establishes the nofs context. If memory reclaim
> > enters XFS through xfs_vn_sync_lazytime(), this GFP_KERNEL allocation can
> > trigger a warning from the reclaim path.
>
> Please include the warning and stack trace in the commit message.
>
> > Use GFP_NOFS for the transaction allocation to avoid filesystem reclaim
> > recursion before the nofs context is set.
> >
> > Link: https://syzkaller.appspot.com/bug?extid=d78ace33ad4ee69329d5
>
> That's a PF_MEMALLOC + __GFP_NOFAIL warning. Has nothing to do
> with GFP_NOFS.

Yes.
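
For anyone following along, a rough userspace model of why the flag
change cannot help. This is only a sketch, not kernel code: the MODEL_*
names, their bit values and model_alloc_warns() are made up for
illustration and merely stand in for the condition described above.

```c
#include <stdbool.h>

/* Illustrative bit values only; NOT the kernel's real flag assignments. */
#define MODEL_PF_MEMALLOC  0x1u  /* task is running inside memory reclaim */

#define MODEL_GFP_RECLAIM  0x1u  /* allocation may enter reclaim */
#define MODEL_GFP_FS       0x2u  /* reclaim may recurse into filesystems */
#define MODEL_GFP_NOFAIL   0x4u  /* allocation must not fail */

/* GFP_NOFS is GFP_KERNEL with the FS bit dropped. */
#define MODEL_GFP_KERNEL   (MODEL_GFP_RECLAIM | MODEL_GFP_FS)
#define MODEL_GFP_NOFS     (MODEL_GFP_RECLAIM)

/*
 * Returns true when the modelled allocator would warn: a must-succeed
 * allocation issued by a task that is itself in reclaim. The FS bit is
 * deliberately not consulted, because it plays no part in this check.
 */
static bool model_alloc_warns(unsigned int task_flags, unsigned int gfp)
{
	return (gfp & MODEL_GFP_NOFAIL) && (task_flags & MODEL_PF_MEMALLOC);
}
```

With MODEL_PF_MEMALLOC set, model_alloc_warns() returns true for both
MODEL_GFP_KERNEL | MODEL_GFP_NOFAIL and MODEL_GFP_NOFS | MODEL_GFP_NOFAIL,
so switching the allocation to NOFS leaves the warning in place.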

> Indeed, the stack trace trivially demonstrates the cause - the
> sync_lazytime() changes (in 6.19, IIRC) have put a new XFS
> transaction in the iput() path that memory reclaim runs.

The lazytime changes (in 7.0-rc). And I think they do indeed cause
this because we fail to clear I_DIRTY_TIME in some cases.

> We managed to remove all the xfs transactions in this path with the
> introduction of the background inodegc infrastructure because
> lockdep, memory allocation and other stuff really don't like us
> running "must succeed" transactions in the memory reclaim path.
>
> Hence putting a new transaction directly in that path is a
> regression, and so I suspect the sync_lazytime() call directly from
> iput() running a transaction needs to be rethought...

Not a new transaction, but one we didn't hit before. That being said,
doing this separate syncing of the dirty time vs just batching it with
the write_inode_now in iput_final looks really odd to me. This goes back
to Ted's original commit 0ae45f63d4ef8 adding lazytime more than 10 years
ago, which unfortunately does not explain the rationale.