Re: [PATCH 00/13] dax, pmem: move cpu cache maintenance to libnvdimm

From: Christoph Hellwig
Date: Mon Jan 23 2017 - 13:03:26 EST


On Mon, Jan 23, 2017 at 09:14:04AM -0800, Dan Williams wrote:
> The use case that we have now is distinguishing volatile vs persistent
> memory (brd vs pmem).

brd is a development tool, so until we have other reasons for this
abstraction (which I'm pretty sure will show up sooner rather than later)
I would not worry about it too much.

> I took a look at mtd layering approach and the main difference is that
> layers above the block layer do not appear to know anything about mtd
> specifics.

Or the block layer itself for that matter. And that's exactly where
I want DAX to be in the future.

> For fs/dax.c we currently need some path to retrieve a dax
> anchor object through the block device.

We have a need to retrieve the anchor object. We currently do it
through the block layer for historical reasons, but it doesn't have
to be that way.

> > In the longer run I like your dax_operations, but they need to be
> > separate from the block layer.
>
> I'll move them from block_device_operations to dax data hanging off of
> the bdev_inode, or is there a better way to go from bdev-to-dax?

I don't think that's any better. What we really want is a way
to find the underlying persistent memory / DAX / whatever we call
it node without going through a block device. E.g. a library function
to give that object for a given path name, where the path name could
be either that of the /dev/pmemN or the /dev/daxN device.

If the file system for now still needs a block device as well, it
will only accept the /dev/pmemN name, and open both the low-level
pmem device and the block device. Once that file system doesn't
need block code (and I think we could do that easily for XFS,
never mind any new FS) it won't have to deal with the block
device at all.
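
To make the lookup idea concrete, here is a rough userspace sketch in C.
It is not kernel code and not a proposed interface: struct dax_node,
struct dax_ops, dax_lookup_path(), fs_find_dax() and the has_bdev flag
are all hypothetical names invented for illustration, and the static
table stands in for whatever registry the real code would use. It only
shows the shape of the flow: look the object up by path name, and have
a file system that still needs block code refuse anything that is not
backed by a block device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* All names below are hypothetical; none of this is existing kernel API. */
struct dax_node;

struct dax_ops {
	/* Return a directly addressable pointer for a byte offset, or NULL. */
	void *(*direct_access)(struct dax_node *node, size_t off);
};

struct dax_node {
	const char *path;		/* "/dev/pmemN" or "/dev/daxN" */
	bool has_bdev;			/* true only for the pmem flavour */
	const struct dax_ops *ops;
	unsigned char mem[64];		/* stand-in for the persistent range */
};

static void *toy_direct_access(struct dax_node *node, size_t off)
{
	return off < sizeof(node->mem) ? &node->mem[off] : NULL;
}

static const struct dax_ops toy_ops = { .direct_access = toy_direct_access };

/* Toy registry; a real implementation would resolve the path properly. */
static struct dax_node nodes[] = {
	{ .path = "/dev/pmem0", .has_bdev = true,  .ops = &toy_ops },
	{ .path = "/dev/dax0",  .has_bdev = false, .ops = &toy_ops },
};

/* Find the DAX object for a path name without touching a block device. */
struct dax_node *dax_lookup_path(const char *path)
{
	assert(path != NULL);
	for (size_t i = 0; i < sizeof(nodes) / sizeof(nodes[0]); i++)
		if (strcmp(nodes[i].path, path) == 0)
			return &nodes[i];
	return NULL;
}

/*
 * File system side: while block code is still needed, only accept
 * nodes that also have a block device; otherwise take any DAX node.
 */
struct dax_node *fs_find_dax(const char *path, bool needs_bdev)
{
	struct dax_node *node = dax_lookup_path(path);

	if (node && needs_bdev && !node->has_bdev)
		return NULL;
	return node;
}
```

In this sketch a file system that passes needs_bdev = true gets a node
for "/dev/pmem0" but NULL for "/dev/dax0", while one that has dropped
its block code gets a node for either name and talks to it only
through ops.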

pmem.c then becomes a consumer of the dax_ops just like the file system.