Re: [RFC PATCH 2/2] mm, fs: daxfile, an interface for byte-addressable updates to pmem
From: Dave Chinner
Date: Tue Jun 20 2017 - 19:53:55 EST
On Tue, Jun 20, 2017 at 09:17:36AM -0700, Dan Williams wrote:
> On Tue, Jun 20, 2017 at 1:49 AM, Christoph Hellwig <hch@xxxxxx> wrote:
> > [stripped giant fullquotes]
> >
> > On Mon, Jun 19, 2017 at 10:53:12PM -0700, Andy Lutomirski wrote:
> >> But that's my whole point. The kernel doesn't really need to prevent
> >> all these background maintenance operations -- it just needs to block
> >> .page_mkwrite until they are synced. I think that whatever new
> >> mechanism we add for this should be sticky, but I see no reason why
> >> the filesystem should have to block reflink on a DAX file entirely.
> >
> > Agreed - IFF we want to support write through semantics this is the
> > only somewhat feasible way. It still has massive downsides of forcing
> > the full sync machinery to run from the page fault handler, which
> > I'm rather scared of, but that's still better than creating a magic
> > special case that isn't manageable at all.
>
> An immutable-extent DAX-file and a reflink-capable DAX-file are not
> mutually exclusive,

Actually, they are mutually exclusive: when the immutable extent DAX
inode is breaking the extent sharing done during the reflink
operation, the copy-on-write operation requires allocating and
freeing extents on the inode that has immutable extents. Which, if
the inode really has immutable extents, cannot be done.

That said, if the extent sharing is broken on the other side of the
reflink (i.e. the non-immutable inode created by the reflink) then
the extent map of the inode with immutable extents will remain
unchanged. i.e. there are two sides to this, and if you only see one
side you might come to the wrong conclusion.

However, we cannot guarantee that no writes occur to the inode with
immutable extent maps (especially as the whole point is to allow
userspace writes and commits without the kernel being involved), so
extent sharing on immutable extent maps cannot be allowed...
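
To make the two sides concrete, here is a toy model in Python. It is
only a sketch and not kernel code: ToyFs, Inode and
ImmutableExtentError are made-up names, extents are reduced to single
block numbers, and sharing is tracked with a plain refcount. It
assumes every write goes through write(), which is exactly what DAX
userspace stores do not do, so a real kernel would not even get the
chance to return an error.

```python
class ImmutableExtentError(Exception):
    """A write would require changing an immutable extent map."""


class Inode:
    def __init__(self, name, extents, immutable=False):
        self.name = name
        self.extents = list(extents)  # one physical block per file offset
        self.immutable = immutable


class ToyFs:
    """Hypothetical filesystem: refcounted blocks, CoW on shared writes."""

    def __init__(self):
        self.refcount = {}
        self.next_block = 0

    def alloc(self):
        block = self.next_block
        self.next_block += 1
        self.refcount[block] = 1
        return block

    def create(self, name, nblocks, immutable=False):
        return Inode(name, [self.alloc() for _ in range(nblocks)], immutable)

    def reflink(self, src, name):
        # Share every block of src with a new, non-immutable inode.
        for block in src.extents:
            self.refcount[block] += 1
        return Inode(name, src.extents)

    def write(self, inode, offset):
        block = inode.extents[offset]
        if self.refcount[block] == 1:
            return block  # not shared: write in place, map unchanged
        if inode.immutable:
            # Breaking the sharing here means allocating a new block and
            # remapping this offset, which an immutable map forbids.
            raise ImmutableExtentError(inode.name)
        new_block = self.alloc()
        self.refcount[block] -= 1
        inode.extents[offset] = new_block
        return new_block
```

Writing to the clone breaks the sharing on the clone's side and leaves
the immutable inode's map alone. Writing to a still-shared block of
the immutable inode has no valid outcome in this model, which is the
case that rules out sharing in the first place.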
Cheers,
Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx