Re: [HMM 07/15] mm/ZONE_DEVICE: new type of ZONE_DEVICE for unaddressable memory v3

From: Jerome Glisse
Date: Tue May 30 2017 - 17:43:52 EST


On Tue, May 30, 2017 at 10:43:55AM -0600, Ross Zwisler wrote:
> On Wed, May 24, 2017 at 01:20:16PM -0400, Jérôme Glisse wrote:
> > HMM (heterogeneous memory management) need struct page to support migration
> > from system main memory to device memory. Reasons for HMM and migration to
> > device memory is explained with HMM core patch.
> >
> > This patch deals with device memory that is un-addressable memory (ie CPU
> > can not access it). Hence we do not want those struct page to be manage
> > like regular memory. That is why we extend ZONE_DEVICE to support different
> > types of memory.
> >
> > A persistent memory type is define for existing user of ZONE_DEVICE and a
> > new device un-addressable type is added for the un-addressable memory type.
> > There is a clear separation between what is expected from each memory type
> > and existing user of ZONE_DEVICE are un-affected by new requirement and new
> > use of the un-addressable type. All specific code path are protect with
> > test against the memory type.
> >
> > Because memory is un-addressable we use a new special swap type for when
> > a page is migrated to device memory (this reduces the number of maximum
> > swap file).
> >
> > The main two additions beside memory type to ZONE_DEVICE is two callbacks.
> > First one, page_free() is call whenever page refcount reach 1 (which means
> > the page is free as ZONE_DEVICE page never reach a refcount of 0). This
> > allow device driver to manage its memory and associated struct page.
> >
> > The second callback page_fault() happens when there is a CPU access to
> > an address that is back by a device page (which are un-addressable by the
> > CPU). This callback is responsible to migrate the page back to system
> > main memory. Device driver can not block migration back to system memory,
> > HMM make sure that such page can not be pin into device memory.
> >
> > If device is in some error condition and can not migrate memory back then
> > a CPU page fault to device memory should end with SIGBUS.
> >
> > Changed since v2:
> > - s/DEVICE_UNADDRESSABLE/DEVICE_PRIVATE
> > Changed since v1:
> > - rename to device private memory (from device unaddressable)
> >
> > Signed-off-by: Jérôme Glisse <jglisse@xxxxxxxxxx>
> > Acked-by: Dan Williams <dan.j.williams@xxxxxxxxx>
> > Cc: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
> > ---
> <>
> > @@ -35,18 +37,88 @@ static inline struct vmem_altmap *to_vmem_altmap(unsigned long memmap_start)
> > }
> > #endif
> >
> > +/*
> > + * Specialize ZONE_DEVICE memory into multiple types each having differents
> > + * usage.
> > + *
> > + * MEMORY_DEVICE_PUBLIC:
> > + * Persistent device memory (pmem): struct page might be allocated in different
> > + * memory and architecture might want to perform special actions. It is similar
> > + * to regular memory, in that the CPU can access it transparently. However,
> > + * it is likely to have different bandwidth and latency than regular memory.
> > + * See Documentation/nvdimm/nvdimm.txt for more information.
> > + *
> > + * MEMORY_DEVICE_PRIVATE:
> > + * Device memory that is not directly addressable by the CPU: CPU can neither
> > + * read nor write _UNADDRESSABLE memory. In this case, we do still have struct
> _PRIVATE
>
> Just noticed that one holdover from the DEVICE_UNADDRESSABLE naming.
>

Thanks for catching that, Andrew can you change it yourself to _PRIVATE
s/_UNADDRESSABLE/_PRIVATE

Or should i repost fixed patch ?

Cheers,
Jérôme