Re: [PATCH v9 7/8] xfs: support CoW in fsdax mode

From: Darrick J. Wong
Date: Fri Sep 17 2021 - 11:34:52 EST


On Thu, Sep 16, 2021 at 08:32:51AM +0200, Christoph Hellwig wrote:
> On Wed, Sep 15, 2021 at 05:22:27PM -0700, Darrick J. Wong wrote:
> > > xfs_ilock(XFS_I(inode), XFS_MMAPLOCK_SHARED);
> > > ret = dax_iomap_fault(vmf, pe_size, &pfn, NULL,
> > > (write_fault && !vmf->cow_page) ?
> > > - &xfs_direct_write_iomap_ops :
> > > - &xfs_read_iomap_ops);
> > > + &xfs_dax_write_iomap_ops :
> > > + &xfs_read_iomap_ops);
> >
> > Hmm... I wonder if this should get hoisted to a "xfs_dax_iomap_fault"
> > wrapper like you did for xfs_iomap_zero_range?
>
> This has just a single users, so the classic argument won't apply. That
> being said __xfs_filemap_fault is a complete mess to due the calling
> conventions of the various VFS methods multiplexed into it. So yes,
> splitting out a xfs_dax_iomap_fault to wrap the above plus the
> dax_finish_sync_fault call might not actually be a bad idea nevertheless.

Agree.

> > > + struct xfs_inode *ip = XFS_I(inode);
> > > + /*
> > > + * Usually we use @written to indicate whether the operation was
> > > + * successful. But it is always positive or zero. The CoW needs the
> > > + * actual error code from actor(). So, get it from
> > > + * iomap_iter->processed.
> >
> > Hm. All six arguments are derived from the struct iomap_iter, so maybe
> > it makes more sense to pass that in? I'll poke around with this more
> > tomorrow.
>
> I'd argue against just changing the calling conventions for ->iomap_end
> now. The original iter patches from willy allowed passing a single
> next callback combinging iomap_begin and iomap_end in a way that with
> a little magic we can avoid the indirect calls entirely. I think we'll
> need to experiment with that that a bit and see if is worth the effort
> first. I plan to do that but I might not get to it immediate. If some
> else wants to take over I'm fine with that.

Ah, I forgot that. Yay Etch-a-Sketch brain. <shake> -ENODATA ;)

> > > static int
> > > xfs_buffered_write_iomap_begin(
> >
> > Also, we have an related request to drop the EXPERIMENTAL tag for
> > non-DAX reflink. Whichever patch enables dax+reflink for xfs needs to
> > make it clear that reflink + any possibility of DAX emits an
> > EXPERIMENTAL warning.
>
> More importantly before we can merge this series we also need the VM
> level support for reflink-aware reverse mapping. So while this series
> here is no in a good enough shape I don't see how we could merge it
> without that other series as we'd have to disallow mmap for reflink+dax
> files otherwise.

I've forgotten why we need mm level reverse mapping again? The pmem
poison stuff can use ->media_failure (or whatever it was called,
memory_failure?) to find all the owners and notify them. Was there
some other accounting reason that fell out of my brain?

I'm more afraid of 'sharing pages between files needs mm support'
sparking another multi-year folioesque fight with the mm people.

--D