Re: [PATCH 00/13] dax, pmem: move cpu cache maintenance to libnvdimm
From: Dan Williams
Date: Mon Jan 23 2017 - 13:31:28 EST
On Mon, Jan 23, 2017 at 10:03 AM, Christoph Hellwig <hch@xxxxxx> wrote:
> On Mon, Jan 23, 2017 at 09:14:04AM -0800, Dan Williams wrote:
>> The use case that we have now is distinguishing volatile vs persistent
>> memory (brd vs pmem).
>
> brd is a development tool, so until we have other reasons for this
> abstraction (which I'm pretty sure will show up rather sooner than later)
> I would not worry about it too much.

By "volatile" I also meant cases where pmem is fronting volatile
memory, or more importantly when the platform has otherwise arranged
for cpu caches to be flushed on a power loss event like I believe some
existing storage appliances do.

>> I took a look at the mtd layering approach and the main difference is that
>> layers above the block layer do not appear to know anything about mtd
>> specifics.
>
> Or the block layer itself for that matter. And that's exactly where
> I want DAX to be in the future.
>
>> For fs/dax.c we currently need some path to retrieve a dax
>> anchor object through the block device.
>
> We have a need to retrieve the anchor object. We currently do it
> through the block layer for historical reasons, but it doesn't have
> to be that way.
>
>> > In the longer run I like your dax_operations, but they need to be
>> > separate from the block layer.
>>
>> I'll move them from block_device_operations to dax data hanging off of
>> the bdev_inode, or is there a better way to go from bdev-to-dax?
>
> I don't think that's any better. What we really want is a way
> to find the underlying persistent memory / DAX / whatever we call
> it node without going through a block device. E.g. a library function
> to give that object for a given path name, where the path name could
> be either that of the /dev/pmemN or the /dev/daxN device.
>
> If the file system for now still needs a block device as well it
> will only accept the /dev/pmemN name, and open both the low-level
> pmem device and the block device. Once that file system doesn't
> need block code (and I think we could do that easily for XFS,
> never mind any new FS) it won't have to deal with the block
> device at all.
>
> pmem.c then becomes a consumer of the dax_ops just like the file system.

Ah ok, I'll take a look at a dax_by_path() capability.
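
To make the path-based lookup discussed above concrete, here is a rough
userspace model in C. It is only a sketch under stated assumptions, not
kernel code and not the eventual interface: the struct layouts, the
static registry, the has_block_dev flag, and the fs_open_dax() helper
are all hypothetical. Only the names dax_ops and dax_by_path() come
from this thread, and their signatures here are guesses.

```c
/* Userspace model only; nothing here is an existing kernel API. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct dax_dev;

/* Hypothetical operations a DAX provider would export. */
struct dax_ops {
	/* Make a range durable. May be a no-op if the platform
	 * flushes CPU caches on power loss. */
	void (*flush)(struct dax_dev *dev, size_t off, size_t len);
};

/* Hypothetical anchor object, found without going through a bdev. */
struct dax_dev {
	const char *path;          /* "/dev/pmemN" or "/dev/daxN" */
	const struct dax_ops *ops;
	int has_block_dev;         /* nonzero for /dev/pmemN style nodes */
	unsigned long flushes;     /* demo counter, not a real field */
};

static void flush_cpu_cache(struct dax_dev *dev, size_t off, size_t len)
{
	(void)off;
	(void)len;
	dev->flushes++;            /* stand-in for real cache maintenance */
}

static void flush_noop(struct dax_dev *dev, size_t off, size_t len)
{
	(void)dev;
	(void)off;
	(void)len;
}

static const struct dax_ops pmem_ops = { .flush = flush_cpu_cache };
static const struct dax_ops noflush_ops = { .flush = flush_noop };

/* Assumed registry; a real implementation would resolve the path
 * through the VFS instead of a static table. */
static struct dax_dev dax_registry[] = {
	{ "/dev/pmem0", &pmem_ops, 1, 0 },
	{ "/dev/dax0", &noflush_ops, 0, 0 },
};

/* Return the anchor object for a path name, or NULL if unknown. */
struct dax_dev *dax_by_path(const char *path)
{
	size_t i;

	assert(path != NULL);
	for (i = 0; i < sizeof(dax_registry) / sizeof(dax_registry[0]); i++)
		if (strcmp(dax_registry[i].path, path) == 0)
			return &dax_registry[i];
	return NULL;
}

/* A file system that still needs block code accepts only nodes that
 * also have a block device; one that does not can take either. */
struct dax_dev *fs_open_dax(const char *path, int needs_block)
{
	struct dax_dev *dev = dax_by_path(path);

	if (dev && needs_block && !dev->has_block_dev)
		return NULL;
	return dev;
}
```

In this model a block-dependent file system calling
fs_open_dax("/dev/dax0", 1) is refused, while one that no longer needs
block code can open either node and call dev->ops->flush() directly.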