Re: [PATCH v3 25/25] xfs: Support large folios

From: Matthew Wilcox
Date: Mon Jun 27 2022 - 10:10:48 EST


On Sun, Jun 26, 2022 at 09:15:27PM -0700, Darrick J. Wong wrote:
> On Wed, Jun 22, 2022 at 05:42:11PM -0700, Darrick J. Wong wrote:
> > [resend with shorter 522.out file to keep us under the 300k maximum]
> >
> > On Thu, Dec 16, 2021 at 09:07:15PM +0000, Matthew Wilcox (Oracle) wrote:
> > > Now that iomap has been converted, XFS is large folio safe.
> > > Indicate to the VFS that it can now create large folios for XFS.
> > >
> > > Signed-off-by: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx>
> > > Reviewed-by: Christoph Hellwig <hch@xxxxxx>
> > > Reviewed-by: Darrick J. Wong <djwong@xxxxxxxxxx>
> > > ---
> > > fs/xfs/xfs_icache.c | 2 ++
> > > 1 file changed, 2 insertions(+)
> > >
> > > diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> > > index da4af2142a2b..cdc39f576ca1 100644
> > > --- a/fs/xfs/xfs_icache.c
> > > +++ b/fs/xfs/xfs_icache.c
> > > @@ -87,6 +87,7 @@ xfs_inode_alloc(
> > > /* VFS doesn't initialise i_mode or i_state! */
> > > VFS_I(ip)->i_mode = 0;
> > > VFS_I(ip)->i_state = 0;
> > > + mapping_set_large_folios(VFS_I(ip)->i_mapping);
> > >
> > > XFS_STATS_INC(mp, vn_active);
> > > ASSERT(atomic_read(&ip->i_pincount) == 0);
> > > @@ -320,6 +321,7 @@ xfs_reinit_inode(
> > > inode->i_rdev = dev;
> > > inode->i_uid = uid;
> > > inode->i_gid = gid;
> > > + mapping_set_large_folios(inode->i_mapping);
> >
> > Hmm. Ever since 5.19-rc1, I've noticed that fsx in generic/522 now
> > reports file corruption after 20 minutes of runtime. The corruption is
> > surprisingly reproducible (522.out.bad attached below) in that I ran it
> > three times and always got the same bad offset (0x6e000) and always the
> > same opcode (6213798(166 mod 256) MAPREAD).
> >
> > I turned off multipage folios and now 522 has run for over an hour
> > without problems, so before I go do more debugging, does this ring a
> > bell to anyone?
>
> I tried bisecting, but that didn't yield anything productive and
> 5.19-rc4 still fails after 25 minutes; however, it seems that g/522 will
> run without problems for at least 3-4 days after reverting this patch
> from -rc3.
>
> So I guess I have a blunt force fix if we can't figure this one out
> before 5.19 final, but I'd really rather not. Will keep trying this
> week.

I'm on holiday for the next week, so I'm not going to be able to spend
any time on this until then. I have a suspicion that this may be the
same bug Zorro is seeing here:

https://lore.kernel.org/linux-mm/20220613010850.6kmpenitmuct2osb@zlang-mailbox/

At least I hope it is, and finding a folio that has been freed would
explain (apparent) file corruption.