Re: [PATCH 0/4] mm,memory_hotplug: allocate memmap from hotadded memory

From: Oscar Salvador
Date: Fri Mar 29 2019 - 04:45:51 EST


On Thu, Mar 28, 2019 at 04:31:44PM +0100, David Hildenbrand wrote:
> Correct me if I am wrong. I think I was confused - vmemmap data is still
> allocated *per memory block*, not for the whole added memory, correct?

No, vmemmap data is allocated per added memory-resource.
For a DIMM that is the whole DIMM; for a qemu memory-device, it is the whole
memory-device.
That assumes ACPI does not split the DIMM/memory-device into several memory
resources.
If it does, acpi_memory_enable_device() calls __add_memory() for every
memory-resource, which means that the vmemmap data will be allocated per
memory-resource.
I have not seen this happen though, and I am not sure under which circumstances
it can happen (I have to study the ACPI code a bit more).

The problem with allocating vmemmap data per memblock is fragmentation.
Let us say you do the following:

* memblock granularity 128M

(qemu) object_add memory-backend-ram,id=ram0,size=256M
(qemu) device_add pc-dimm,id=dimm0,memdev=ram0,node=1

This will create two memblocks (2 sections), and if we allocate the vmemmap
data for each section from within that same section (memblock), you only get
126M of contiguous memory per memblock.
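
To make the fragmentation concrete, here is a rough arithmetic sketch (not
kernel code). It assumes 4K pages and a 64-byte struct page, which is what
gives 2MB of memmap per 128M section; the helper names are made up for
illustration.

```python
# Sketch only: assumes 4 KiB pages and a 64-byte struct page (typical x86_64).
# These values are assumptions, not taken from the patchset.
MiB = 1 << 20
PAGE_SIZE = 4096
STRUCT_PAGE_SIZE = 64
MEMBLOCK_SIZE = 128 * MiB


def memmap_bytes(mem_bytes):
    """Size of the memmap array needed to describe mem_bytes of memory."""
    return (mem_bytes // PAGE_SIZE) * STRUCT_PAGE_SIZE


def usable_runs_per_memblock(total_bytes):
    """Per-memblock placement: every memblock hosts its own memmap at its start.

    Returns a list of (start_offset, size) tuples for the usable ranges.
    """
    runs = []
    for base in range(0, total_bytes, MEMBLOCK_SIZE):
        reserved = memmap_bytes(MEMBLOCK_SIZE)
        runs.append((base + reserved, MEMBLOCK_SIZE - reserved))
    return runs


runs = usable_runs_per_memblock(256 * MiB)
for start, size in runs:
    print(f"usable range at offset {start // MiB}M, {size // MiB}M long")
```

With these assumptions the 256M DIMM ends up with two separate 126M ranges,
split by the second memblock's own memmap.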

So, the approach taken is to allocate the vmemmap data corresponding to the
whole DIMM/memory-device/memory-resource from the beginning of its memory.

In the example from above, the vmemmap data for both sections is allocated from
the beginning of the first section:

The memmap array takes 2MB per section, i.e. 512 pfns.
If we add 2 sections:

[ pfn#0 ] \
[ ... ] | vmemmap used for memmap array
[pfn#1023 ] /

[pfn#1024 ] \
[ ... ] | used as normal memory
[pfn#65535] /

So, out of 256M, we get 252M to use as real memory, as 4M will be used for
building the memmap array.
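
The same numbers can be reproduced with a small sketch of the per-resource
layout. Again this is just arithmetic under the assumption of 4K pages and a
64-byte struct page; per_resource_layout() is a hypothetical helper, not a
function from the patchset.

```python
# Sketch only: assumes 4 KiB pages and a 64-byte struct page.
MiB = 1 << 20
PAGE_SIZE = 4096
STRUCT_PAGE_SIZE = 64


def per_resource_layout(resource_bytes):
    """Place the memmap for the whole resource at the start of the resource.

    pfns are relative to the start of the hot-added range.
    """
    total_pfns = resource_bytes // PAGE_SIZE
    memmap_size = total_pfns * STRUCT_PAGE_SIZE
    # Round up to whole pages.
    memmap_pfns = (memmap_size + PAGE_SIZE - 1) // PAGE_SIZE
    return {
        "memmap": (0, memmap_pfns - 1),
        "normal": (memmap_pfns, total_pfns - 1),
        "usable_bytes": (total_pfns - memmap_pfns) * PAGE_SIZE,
    }


layout = per_resource_layout(256 * MiB)
print("memmap pfns:", layout["memmap"])
print("normal pfns:", layout["normal"])
print("usable:", layout["usable_bytes"] // MiB, "M")
```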

Actually, depending on how big a DIMM/memory-device is, the first memblock(s)
can end up fully used for the memmap array (of course, this can only be seen
when adding a huge DIMM/memory-device).
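
As a rough estimate of what "huge" means here: if the memmap costs 2MB per
128M (1/64 of the memory), a 128M memblock is consumed entirely once the
device reaches 8G. A sketch under the same assumptions (4K pages, 64-byte
struct page, 128M memblocks; the function name is made up):

```python
# Sketch only: assumes 4 KiB pages, a 64-byte struct page and 128M memblocks.
MiB = 1 << 20
GiB = 1 << 30
PAGE_SIZE = 4096
STRUCT_PAGE_SIZE = 64
MEMBLOCK_SIZE = 128 * MiB


def memblocks_fully_used_by_memmap(resource_bytes):
    """Number of leading memblocks that hold nothing but the memmap array."""
    memmap_size = (resource_bytes // PAGE_SIZE) * STRUCT_PAGE_SIZE
    return memmap_size // MEMBLOCK_SIZE


for size in (256 * MiB, 4 * GiB, 8 * GiB, 16 * GiB):
    n = memblocks_fully_used_by_memmap(size)
    print(f"{size // MiB}M device: {n} memblock(s) fully used by memmap")
```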

--
Oscar Salvador
SUSE L3