Re: [PATCH 1/3] memremap.c : Add support for ZONE_DEVICE IO memory with struct pages.

From: Dan Williams
Date: Wed Oct 19 2016 - 16:02:39 EST


On Wed, Oct 19, 2016 at 11:40 AM, Stephen Bates <sbates@xxxxxxxxxxxx> wrote:
> On Wed, Oct 19, 2016 at 10:50:25AM -0700, Dan Williams wrote:
>> On Tue, Oct 18, 2016 at 2:42 PM, Stephen Bates <sbates@xxxxxxxxxxxx> wrote:
>> > From: Logan Gunthorpe <logang@xxxxxxxxxxxx>
>> >
>> > We build on recent work that adds memory regions owned by a device
>> > driver (ZONE_DEVICE) [1] and to add struct page support for these new
>> > regions of memory [2].
>> >
>> > 1. Add an extra flags argument into dev_memremap_pages to take in a
>> > MEMREMAP_XX argument. We update the existing calls to this function to
>> > reflect the change.
>> >
>> > 2. For completeness, we add MEMREMAP_WT support to the memremap;
>> > however we have no actual need for this functionality.
>> >
>> > 3. We add the static functions, add_zone_device_pages and
>> > remove_zone_device pages. These are similar to arch_add_memory except
>> > they don't create the memory mapping. We don't believe these need to be
>> > made arch specific, but are open to other opinions.
>> >
>> > 4. dev_memremap_pages and devm_memremap_pages_release are updated to
>> > treat IO memory slightly differently. For IO memory we use a combination
>> > of the appropriate io_remap function and the zone_device pages functions
>> > created above. A flags variable and kaddr pointer are added to struct
>> > page_mem to facilitate this for the release function. We also set up
>> > the page attribute tables for the mapped region correctly based on the
>> > desired mapping.
>> >
>>
>> This description says "what" is being done, but not "why".
>
> Hi Dan
>
> We discuss the motivation in the cover letter.
>
>>
>> In the cover letter, "[PATCH 0/3] iopmem : A block device for PCIe
>> memory", it mentions that the lack of I/O coherency is a known issue
>> and users of this functionality need to be cognizant of the pitfalls.
>> If that is the case why do we need support for different cpu mapping
>> types than the default write-back cache setting? It's up to the
>> application to handle cache cpu flushing similar to what we require of
>> device-dax users in the persistent memory case.
>
> Some of the iopmem hardware we have tested has certain alignment
> restrictions on BAR accesses. At the very least we require write
> combine mappings for these. We then felt it appropriate to add the
> other mappings for the sake of completeness.

If the device can support write-combine then it can support bursts, so
I wonder why it couldn't support read bursts for cache fills...