Re: [Linux-nvdimm] [RFC PATCH 0/7] evacuate struct page from the block layer

From: Dan Williams
Date: Thu Mar 19 2015 - 16:59:36 EST


On Thu, Mar 19, 2015 at 12:59 PM, Andrew Morton
<akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Thu, 19 Mar 2015 17:54:15 +0200 Boaz Harrosh <boaz@xxxxxxxxxxxxx> wrote:
>
>> On 03/19/2015 03:43 PM, Matthew Wilcox wrote:
>> <>
>> >
>> > Dan missed "Support O_DIRECT to a mapped DAX file". More generally, if we
>> > want to be able to do any kind of I/O directly to persistent memory,
>> > and I think we do, we need to do one of:
>> >
>> > 1. Construct struct pages for persistent memory
>> > 1a. Permanently
>> > 1b. While the pages are under I/O
>> > 2. Teach the I/O layers to deal in PFNs instead of struct pages
>> > 3. Replace struct page with some other structure that can represent both
>> > DRAM and PMEM
>> >
>> > I'm personally a fan of #3, and I was looking at the scatterlist as
>> > my preferred data structure. I now believe the scatterlist as it is
>> > currently defined isn't sufficient, so we probably end up needing a new
>> > data structure. I think Dan's preferred method of replacing struct
>> > pages with PFNs is actually less intrusive, but doesn't give us as
>> > much advantage (an entirely new data structure would let us move to an
>> > extent-based system at the same time, instead of sticking with an array
>> > of pages). Clearly Boaz prefers 1a, which works well enough for the
>> > 8GB NV-DIMMs, but not well enough for the 400GB NV-DIMMs.
>> >
>> > What's your preference? I guess option 0 is "force all I/O to go
>> > through the page cache and then get copied", but that feels like a nasty
>> > performance hit.
>>
>> Thanks Matthew, you have summarized it perfectly.
>>
>> I think #1b might have merit, as well.
>
> It would be interesting to see what a 1b implementation looks like and
> how it performs. We already allocate a bunch of temporary things to
> support in-flight IO (bio, request) and allocating pageframes on the
> same basis seems a fairly logical fit.

At least for block I/O, it seems the only place we really need struct
page infrastructure is for kmap(). Given that we already need a kmap_pfn()
solution for option 2, a "dynamic allocation" stop along that
development path may just naturally fall out.
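
To make the kmap_pfn() idea concrete, here is a rough userspace sketch,
not kernel code. Everything in it is an assumption made for illustration:
struct pfn_vec, the sketch_* helpers, and the static array standing in for
a persistent-memory range are all hypothetical names, and the real mapping
step would depend on how the range is set up. It only shows the shape of
option 2, where an I/O segment names a page frame number and the mapping
step needs no struct page.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)
#define SKETCH_NR_PAGES   4UL

/* Stand-in for a persistent-memory range with no struct page behind it. */
static unsigned char sketch_pmem[SKETCH_NR_PAGES * SKETCH_PAGE_SIZE];

/* Hypothetical pfn-based I/O segment: names a frame, not a struct page. */
struct pfn_vec {
	unsigned long pfn;	/* page frame number within sketch_pmem */
	unsigned int offset;	/* byte offset inside that frame */
	unsigned int len;	/* number of bytes in the segment */
};

/*
 * Hypothetical kmap_pfn() analogue: turn a pfn into a usable address.
 * Returns NULL when the pfn lies outside the simulated range.
 */
static void *sketch_kmap_pfn(unsigned long pfn)
{
	if (pfn >= SKETCH_NR_PAGES)
		return NULL;
	return sketch_pmem + (pfn << SKETCH_PAGE_SHIFT);
}

/* Matching unmap; a no-op here because the simulated range is always mapped. */
static void sketch_kunmap_pfn(void *addr)
{
	(void)addr;
}

/* Copy a buffer into the segment. Returns 0 on success, -1 on a bad segment. */
static int sketch_copy_to_pfn_vec(const struct pfn_vec *vec, const void *src)
{
	unsigned char *base;

	/* Reject segments that would cross the end of the frame. */
	if (vec->offset > SKETCH_PAGE_SIZE ||
	    vec->len > SKETCH_PAGE_SIZE - vec->offset)
		return -1;

	base = sketch_kmap_pfn(vec->pfn);
	if (!base)
		return -1;

	memcpy(base + vec->offset, src, vec->len);
	sketch_kunmap_pfn(base);
	return 0;
}
```

A caller would fill in a struct pfn_vec such as { .pfn = 2, .offset = 16,
.len = 5 } and pass it to sketch_copy_to_pfn_vec() together with the source
buffer. A "1b" variant would presumably allocate a temporary page structure
at the map step instead of computing the address directly.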