Re: [RFC PATCH 2/3] mm/memory_hotplug: Create __shrink_pages and move it to offline_pages

From: Oscar Salvador
Date: Wed Aug 08 2018 - 09:45:44 EST


On Wed, Aug 08, 2018 at 10:08:41AM +0200, David Hildenbrand wrote:
> Then it is maybe time to clearly distinguish both types of memory, as
> they are fundamentally different when it comes to online/offline behavior.
>
> Ordinary ram:
> add_memory ...
> online_pages ...
> offline_pages
> remove_memory
>
> Device memory
> add_device_memory ...
> remove_device_memory
>
> So adding/removing from the zone and stuff can be handled there.

Uhm, I have been thinking about this.
Maybe we could do something like (completely untested):


== memory_hotplug code ==

int add_device_memory(int nid, unsigned long start, unsigned long size,
		      struct vmem_altmap *altmap, bool mapping)
{
	int ret;
	unsigned long start_pfn = PHYS_PFN(start);
	unsigned long nr_pages = size >> PAGE_SHIFT;

	mem_hotplug_begin();
	if (mapping)
		ret = arch_add_memory(nid, start, size, altmap, false);
	else
		ret = add_pages(nid, start_pfn, nr_pages, altmap, false);

	if (!ret) {
		pg_data_t *pgdata = NODE_DATA(nid);
		struct zone *zone = &pgdata->node_zones[ZONE_DEVICE];

		online_mem_sections(start_pfn, start_pfn + nr_pages);
		move_pfn_range_to_zone(zone, start_pfn, nr_pages, altmap);
	}
	mem_hotplug_done();

	return ret;
}

int del_device_memory(int nid, unsigned long start, unsigned long size,
		      struct vmem_altmap *altmap, bool mapping)
{
	int ret;
	unsigned long start_pfn = PHYS_PFN(start);
	unsigned long nr_pages = size >> PAGE_SHIFT;
	pg_data_t *pgdata = NODE_DATA(nid);
	struct zone *zone = &pgdata->node_zones[ZONE_DEVICE];

	mem_hotplug_begin();

	offline_mem_sections(start_pfn, start_pfn + nr_pages);
	__shrink_pages(zone, start_pfn, start_pfn + nr_pages, nr_pages);

	if (mapping)
		ret = arch_remove_memory(nid, start, size, altmap);
	else
		ret = __remove_pages(nid, start_pfn, nr_pages, altmap);

	mem_hotplug_done();

	return ret;
}

===

And then, HMM/devm code could use it.

For example:

hmm_devmem_pages_create():

	...
	...
	if (devmem->pagemap.type == MEMORY_DEVICE_PUBLIC)
		linear_mapping = true;
	else
		linear_mapping = false;

	ret = add_device_memory(nid, align_start, align_size, NULL, linear_mapping);
	if (ret)
		goto error_add_memory;
	...
	...


hmm_devmem_release():

	...
	...
	if (resource->desc == IORES_DESC_DEVICE_PRIVATE_MEMORY)
		mapping = false;
	else
		mapping = true;

	del_device_memory(nid, start_pfn << PAGE_SHIFT, npages << PAGE_SHIFT,
			  NULL, mapping);
	...
	...


This way we do not need to play tricks in the HMM/devm code; we just need to
call those functions when adding/removing memory.
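
To make the address arithmetic and the mapping decision in the sketch concrete,
here is a small userspace illustration (not kernel code, and just as lightly
checked as the rest). It assumes 4 KiB pages, while the real PAGE_SHIFT depends
on the architecture, and every demo_* name is made up for this example:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. The kernel's PAGE_SHIFT is per-architecture. */
#define DEMO_PAGE_SHIFT 12

/* Hypothetical stand-in for the two kinds of device memory discussed above. */
enum demo_devmem_type { DEMO_DEVICE_PRIVATE, DEMO_DEVICE_PUBLIC };

struct demo_range {
	uint64_t start_pfn;
	uint64_t nr_pages;
};

/* Mirrors PHYS_PFN(start) and size >> PAGE_SHIFT in the helpers. */
static inline struct demo_range demo_phys_to_range(uint64_t start, uint64_t size)
{
	struct demo_range r = {
		.start_pfn = start >> DEMO_PAGE_SHIFT,
		.nr_pages = size >> DEMO_PAGE_SHIFT,
	};
	return r;
}

/* Only public device memory asks for a linear mapping. */
static inline bool demo_wants_linear_mapping(enum demo_devmem_type type)
{
	return type == DEMO_DEVICE_PUBLIC;
}

/* The release path shifts PFNs back; for aligned input this round-trips. */
static inline void demo_self_check(void)
{
	struct demo_range r = demo_phys_to_range(0x100000000ULL, 0x8000000ULL);

	assert((r.start_pfn << DEMO_PAGE_SHIFT) == 0x100000000ULL);
	assert((r.nr_pages << DEMO_PAGE_SHIFT) == 0x8000000ULL);
	assert(demo_wants_linear_mapping(DEMO_DEVICE_PUBLIC));
	assert(!demo_wants_linear_mapping(DEMO_DEVICE_PRIVATE));
}
```

The round trip only holds for page-aligned start and size, which is why the
callers pass align_start/align_size on the add side.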

We would still have to figure out how to handle the
release_mem_region_adjustable() part, though.

Thanks
--
Oscar Salvador
SUSE L3