Re: [PATCH v2 07/13] xfs: Introduce FORCEALIGN inode flag
From: Dave Chinner
Date: Thu Jul 11 2024 - 19:20:40 EST
On Wed, Jul 10, 2024 at 07:59:58PM -0700, Darrick J. Wong wrote:
> On Fri, Jul 05, 2024 at 04:24:44PM +0000, John Garry wrote:
> > +/* Validate the forcealign inode flag */
> > +xfs_failaddr_t
> > +xfs_inode_validate_forcealign(
> > + struct xfs_mount *mp,
> > + uint32_t extsize,
> > + uint32_t cowextsize,
> > + uint16_t mode,
> > + uint16_t flags,
> > + uint64_t flags2)
> > +{
> > + bool rt = flags & XFS_DIFLAG_REALTIME;
> > +
> > + /* superblock rocompat feature flag */
> > + if (!xfs_has_forcealign(mp))
> > + return __this_address;
> > +
> > + /* Only regular files and directories */
> > + if (!S_ISDIR(mode) && !S_ISREG(mode))
> > + return __this_address;
> > +
> > + /* We require EXTSIZE or EXTSZINHERIT */
> > + if (!(flags & (XFS_DIFLAG_EXTSIZE | XFS_DIFLAG_EXTSZINHERIT)))
> > + return __this_address;
> > +
> > + /* We require a non-zero extsize */
> > + if (!extsize)
> > + return __this_address;
> > +
> > + /* Reflink'ed disallowed */
> > + if (flags2 & XFS_DIFLAG2_REFLINK)
> > + return __this_address;
>
> Hmm. If we don't support reflink + forcealign ATM, then shouldn't the
> superblock verifier or xfs_fs_fill_super fail the mount so that old
> kernels won't abruptly emit EFSCORRUPTED errors if a future kernel adds
> support for forcealign'd cow and starts writing out files with both
> iflags set?
I don't think we should error out the mount because reflink and
forcealign are enabled - that's going to be the common configuration
for every user of forcealign, right? I also don't think we should
throw a corruption error if both flags are set, either.
We're making an initial *implementation choice* not to implement the
two features on the same inode at the same time. We are not making
an on-disk format design decision that says "these two on-disk flags
are incompatible".
IOWs, if a current kernel sees both set, it's not corruption; it means
a more recent kernel that supports both flags has modified this inode.
Put simply, we have detected a ro-compat situation for this specific
inode.
Looking at it as a ro-compat situation rather than corruption,
what I would suggest we do is this:
1. Warn at mount that reflink+force align inodes will be treated
as ro-compat inodes. i.e. read-only.
2. prevent forcealign from being set if the shared extent flag is
set on the inode.
3. prevent shared extents from being created if the force align flag
is set (i.e. ->remap_file_range() and anything else that relies on
shared extents will fail on forcealign inodes).
4. if we read an inode with both set, we emit a warning and force
the inode to be read only so we don't screw up the force alignment
of the file (i.e. that inode operates in ro-compat mode.)
#1 is the mount time warning of potential ro-compat behaviour.
#2 and #3 prevent both from getting set on existing kernels.
#4 is the ro-compat behaviour that would occur from taking a
filesystem that ran on a newer kernel that supports force-align+COW.
This avoids corruption shutdowns and modifications that would screw
up the alignment of the shared and COW'd extents.
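To make #2-#4 concrete, here is a rough userspace sketch in C of the
per-inode policy. It is illustration only and not code from the patch:
the DEMO_* flag values, the demo_* function names and the enum are all
hypothetical stand-ins, and a real implementation would sit in the
inode read, ioctl and remap paths rather than in standalone helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the diflags2 bits; values are illustrative. */
#define DEMO_DIFLAG2_REFLINK	(1ULL << 1)
#define DEMO_DIFLAG2_FORCEALIGN	(1ULL << 5)

enum demo_inode_mode {
	DEMO_INODE_RW,		/* normal read-write inode */
	DEMO_INODE_RO_COMPAT,	/* readable, but must not be modified */
};

/* #4: both flags set means "written by a newer kernel", not corruption. */
static enum demo_inode_mode demo_classify_inode(uint64_t flags2)
{
	const uint64_t both = DEMO_DIFLAG2_REFLINK | DEMO_DIFLAG2_FORCEALIGN;

	return (flags2 & both) == both ? DEMO_INODE_RO_COMPAT : DEMO_INODE_RW;
}

/* #2: refuse to set forcealign on an inode that has shared extents. */
static bool demo_can_set_forcealign(uint64_t flags2)
{
	return !(flags2 & DEMO_DIFLAG2_REFLINK);
}

/* #3: refuse to create shared extents on a forcealign inode. */
static bool demo_can_share_extents(uint64_t flags2)
{
	return !(flags2 & DEMO_DIFLAG2_FORCEALIGN);
}

/* Modifications are only allowed when the inode is not in ro-compat mode. */
static bool demo_can_modify(uint64_t flags2)
{
	return demo_classify_inode(flags2) == DEMO_INODE_RW;
}

/* Small self-check showing the intended behaviour of the helpers. */
void demo_policy_selftest(void)
{
	const uint64_t both = DEMO_DIFLAG2_REFLINK | DEMO_DIFLAG2_FORCEALIGN;

	assert(demo_can_modify(DEMO_DIFLAG2_FORCEALIGN));
	assert(!demo_can_modify(both));
	assert(!demo_can_set_forcealign(DEMO_DIFLAG2_REFLINK));
	assert(!demo_can_share_extents(DEMO_DIFLAG2_FORCEALIGN));
}
```

The point of the sketch is that no path returns a corruption error:
the two "can" checks keep an existing kernel from ever creating the
combination, and the classifier only downgrades an inode to read-only
when something else already created it.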
> That said, if the bs>ps patchset lands, then I think forcealign cow is
> a simple matter of setting the min folio order to the forcealign size
> and making sure that we always write out entire folios if any of the
> blocks cached by the folio is shared. Direct writes to forcealigned
> shared files probably has to be aligned to the forcealign size or fall
> back to buffered writes for cow.
Right, I think all the pieces we will need are slowly falling into
place in the near future, so it doesn't seem right to me to actually
prevent filesystems with reflink and force-align both enabled right
now and then end up with a weird filesystem config needed to use
forcealign just for a couple of kernel releases...
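As a rough illustration of the direct write rule suggested above
(aligned to the forcealign size, else fall back to buffered writes for
COW), a userspace sketch in C might look like the following. Nothing
here is existing kernel code: the demo_* names are hypothetical, and
the alignment is assumed to be passed in as a non-zero byte count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum demo_write_path {
	DEMO_WRITE_DIRECT,
	DEMO_WRITE_BUFFERED,
};

/* True if both the start offset and the length are multiples of align_bytes. */
static bool demo_dio_is_forcealigned(uint64_t pos, uint64_t len,
				     uint64_t align_bytes)
{
	assert(align_bytes != 0);
	return pos % align_bytes == 0 && len % align_bytes == 0;
}

/*
 * Only direct writes to shared (reflinked) forcealign files are
 * constrained in this sketch: if the write is not aligned to the
 * forcealign size, fall back to the buffered path for COW.
 */
static enum demo_write_path demo_pick_write_path(bool shared, uint64_t pos,
						  uint64_t len,
						  uint64_t align_bytes)
{
	if (shared && !demo_dio_is_forcealigned(pos, len, align_bytes))
		return DEMO_WRITE_BUFFERED;
	return DEMO_WRITE_DIRECT;
}
```

With a 64k forcealign size, demo_pick_write_path(true, 4096, 65536,
65536) picks the buffered path, while the same write at offset 0 stays
on the direct path.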
-Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx