Re: [PATCH v2 0/2] DAX bdev fixes - move flushing calls to FS

From: Dave Chinner
Date: Fri Feb 12 2016 - 21:40:05 EST


On Fri, Feb 12, 2016 at 12:03:20PM -0700, Ross Zwisler wrote:
> On Thu, Feb 11, 2016 at 01:43:04PM +0100, Jan Kara wrote:
> > On Wed 10-02-16 13:48:54, Ross Zwisler wrote:
> > > 3) In filemap_write_and_wait() and filemap_write_and_wait_range(), continue
> > > the writeback in the case that DAX is enabled but we only have a nonzero
> > > mapping->nrpages. As with 1) and 2), I believe this is necessary to
> > > properly write back metadata changes. If this sounds wrong, please let me
> > > know and I'll get more info.
> >
> > And I'm surprised here as well. If there are dax_mapping() inodes that have
> > pagecache pages, then we have issues with radix tree handling as well. So
> > how come dax_mapping() inodes have pages attached? If it is about block
> > device inodes, then I find it buggy that S_DAX gets set for such inodes
> > when a filesystem is mounted on them because in such cases we are IMO asking
> > for data corruption sooner rather than later...
>
> I think I've figured this one out, at least partially.
>
> For ext2 the issues I was seeing were due to the fact that directory inodes
> have S_DAX set, but have dirty page cache pages. In testing with
> generic/002, I see two ext2 inodes with S_DAX trying to do a writeback while
> they have dirty page cache pages. The first has i_ino=2, which is the
> EXT2_ROOT_INO.
....
> As far as I can see, XFS does not have these issues - returning immediately
> having done just the DAX writeback in xfs_vm_writepages() lets all my xfstests
> pass.

XFS will not have issues because it does not dirty directory inodes
at the VFS level, nor does it use the page cache for directory data.
However, looking at the code I think it does still set S_DAX on
directory inodes, which it shouldn't be doing.

I've got a couple of fixes I need to do in this area - hopefully
I'll get it done on Monday.

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx