Re: [PATCH] dax, pmem: add support for msync

From: Dave Chinner
Date: Mon Aug 31 2015 - 19:38:13 EST


On Mon, Aug 31, 2015 at 12:59:44PM -0600, Ross Zwisler wrote:
> For DAX msync we just need to flush the given range using
> wb_cache_pmem(), which is now a public part of the PMEM API.

This is wrong, because it still leaves fsync() broken on dax.

Flushing dirty data to stable storage is the responsibility of the
writeback infrastructure, not the VMA/mm infrasrtucture. For non-dax
configurations, msync defers all that to vfs_fsync_range(), because
it has to be implemented there for fsync() to work.

Even for DAX, msync has to call vfs_fsync_range() for the filesystem to commit
the backing store allocations to stable storage, so there's not
getting around the fact msync is the wrong place to be flushing
DAX mappings to persistent storage.

I pointed this out almost 6 months ago (i.e. that fsync was broken)
anf hinted at how to solve it. Fix fsync, and msync gets fixed for
free:

https://lists.01.org/pipermail/linux-nvdimm/2015-March/000341.html

I've also reported to Willy that DAX write page faults don't work
correctly, either. xfstests generic/080 exposes this: a read
from a page followed immediately by a write to that page does not
result in ->page_mkwrite being called on the write and so
backing store is not allocated for the page, nor are the timestamps
for the file updated. This will also result in fsync (and msync)
not working properly.

Fix fsync first - if fsync works then msync will work.

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/