Re: [PATCH v5 1/5] mm,memory_hotplug: Allocate memmap from the added memory range

From: David Hildenbrand
Date: Thu Mar 25 2021 - 11:41:49 EST


On 25.03.21 16:35, Michal Hocko wrote:
On Thu 25-03-21 16:19:36, David Hildenbrand wrote:
On 25.03.21 16:12, Michal Hocko wrote:
On Thu 25-03-21 15:46:22, David Hildenbrand wrote:
On 25.03.21 15:34, Michal Hocko wrote:
On Thu 25-03-21 15:09:35, David Hildenbrand wrote:
On 25.03.21 15:08, Michal Hocko wrote:
On Thu 25-03-21 13:40:45, David Hildenbrand wrote:
On 25.03.21 13:35, Michal Hocko wrote:
On Thu 25-03-21 12:08:43, David Hildenbrand wrote:
On 25.03.21 11:55, Oscar Salvador wrote:
[...]
- When moving the initialization/accounting to hot-add/hot-remove,
the section containing the vmemmap pages will remain offline.
It might get onlined once the pages get online in online_pages(),
or not if vmemmap pages span a whole section.
I remember (but maybe David remembers better) that this was a problem
wrt. pfn_to_online_page() and hibernation/kdump.
So, if that is really a problem, we would have to take care of setting
the section to the right state.

Good memory. Indeed, hibernation/kdump won't save the state of the vmemmap,
because the memory is marked as offline and, thus, logically without any
valuable content.

^^^^ THIS


Could you point me to the respective hibernation code please? I always
get lost in that area. Anyway, we do have the same problem even if the
whole accounting is handled during {on,off}lining, no?

kernel/power/snapshot.c:saveable_page().
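
As a rough illustration of why an offline section drops out of the hibernation image, here is a small user-space model of the filter discussed above. It is only a sketch: every toy_* name and constant is hypothetical, and it models nothing but the "page must belong to an online section" check (the role pfn_to_online_page() plays in this thread). The real saveable_page() applies further checks that are not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy parameters, chosen for readability; not the kernel's values. */
#define TOY_PAGES_PER_SECTION 8UL
#define TOY_NR_SECTIONS       4UL
#define TOY_NR_PAGES          (TOY_PAGES_PER_SECTION * TOY_NR_SECTIONS)

/* Hypothetical stand-in for struct page. */
struct toy_page {
	unsigned long flags;
};

static struct toy_page toy_memmap[TOY_NR_PAGES];
static bool toy_section_online[TOY_NR_SECTIONS];

/* Mark one toy section online or offline. */
static void toy_set_section_online(unsigned long nr, bool online)
{
	assert(nr < TOY_NR_SECTIONS);
	toy_section_online[nr] = online;
}

/* Models the idea of pfn_to_online_page(): NULL unless the section is online. */
static struct toy_page *toy_pfn_to_online_page(unsigned long pfn)
{
	if (pfn >= TOY_NR_PAGES)
		return NULL;
	if (!toy_section_online[pfn / TOY_PAGES_PER_SECTION])
		return NULL;
	return &toy_memmap[pfn];
}

/* Models only the online-section filter of a "should we save this page" check. */
static struct toy_page *toy_saveable_page(unsigned long pfn)
{
	return toy_pfn_to_online_page(pfn);
}

/* Count the pages in [start, end) that a snapshot would copy. */
static unsigned long toy_count_saveable(unsigned long start, unsigned long end)
{
	unsigned long pfn, n = 0;

	for (pfn = start; pfn < end; pfn++)
		if (toy_saveable_page(pfn))
			n++;
	return n;
}
```

With section 0 online and section 1 offline, toy_count_saveable(0, 16) counts only the 8 pages of section 0. Any vmemmap pages that live in section 1 would be skipped in the same way.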

Thanks! So this is as I've suspected. The very same problem is present
if the memory block is marked offline. So we need a solution here
anyway. One way to go would be to consider these vmemmap pages always
online. pfn_to_online_page would have to special case them but we would
need to identify them first. I used to have PageVmemmap or something
like that in my early attempt to do this.

That being said, this is not an argument for one or the other approach.
Both need fixing.

Can you elaborate? What is the issue there? What needs fixing?

An offline section containing vmemmap will be lost during the hibernation
cycle, if I understand the above correctly.


Can you tell me how that is a problem with Oscar's current patch? I only see
this being a problem with what you propose - most probably I am missing
something important here.

Offline memory sections don't have a valid memmap (assumption: garbage). On
hibernation, the whole offline memory block won't be saved, including the
vmemmap content that resides on the block. This includes the vmemmap of the
vmemmap pages, which is itself.

When restoring, the whole memory block will contain garbage, including the
whole vmemmap - which is marked to be offline and to contain garbage.

Hmm, so I might be misunderstanding the restoring part. But doesn't that
mean that the whole section/memory block won't get restored because it
is offline and therefore the vmemmap would be pointing to nowhere?

AFAIU, only the content of the memory block won't be restored - whatever
memory content existed before the restore operation is kept.

The structures that define what the vmemmap should look like - the memory
sections and the page tables used for describing the vmemmap - should properly
get saved+restored, as these are located on online memory.

So after restoring, the vmemmap layout should look just like it did before saving.
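
The save/restore behaviour described above can be sketched with a toy user-space model. This is an illustration under stated assumptions, not kernel code: all toy_* names are hypothetical, and the section state is kept in a separate array purely to represent "metadata that lives on online memory and therefore gets saved and restored".

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy parameters; not the kernel's values. */
#define TOY_PAGE_SIZE         4
#define TOY_PAGES_PER_SECTION 2
#define TOY_NR_SECTIONS       2
#define TOY_NR_PAGES          (TOY_PAGES_PER_SECTION * TOY_NR_SECTIONS)

struct toy_machine {
	/* Stands in for section metadata/page tables located on online memory. */
	bool section_online[TOY_NR_SECTIONS];
	/* Page contents, including any vmemmap stored on the hot-added block. */
	unsigned char mem[TOY_NR_PAGES][TOY_PAGE_SIZE];
};

struct toy_image {
	bool section_online[TOY_NR_SECTIONS];
	bool saved[TOY_NR_PAGES];
	unsigned char data[TOY_NR_PAGES][TOY_PAGE_SIZE];
};

static bool toy_page_online(const struct toy_machine *m, int pfn)
{
	assert(pfn >= 0 && pfn < TOY_NR_PAGES);
	return m->section_online[pfn / TOY_PAGES_PER_SECTION];
}

/* Save the metadata and the content of online pages only. */
static void toy_snapshot(const struct toy_machine *m, struct toy_image *img)
{
	memset(img, 0, sizeof(*img));
	memcpy(img->section_online, m->section_online,
	       sizeof(img->section_online));
	for (int pfn = 0; pfn < TOY_NR_PAGES; pfn++) {
		if (!toy_page_online(m, pfn))
			continue;	/* offline: content is not saved */
		img->saved[pfn] = true;
		memcpy(img->data[pfn], m->mem[pfn], TOY_PAGE_SIZE);
	}
}

/* Restore the metadata and saved pages; unsaved pages keep what is in memory. */
static void toy_restore(struct toy_machine *m, const struct toy_image *img)
{
	memcpy(m->section_online, img->section_online,
	       sizeof(m->section_online));
	for (int pfn = 0; pfn < TOY_NR_PAGES; pfn++)
		if (img->saved[pfn])
			memcpy(m->mem[pfn], img->data[pfn], TOY_PAGE_SIZE);
}
```

In this model, the layout information (which sections are online) comes back exactly as it was saved, while pages in the offline section hold whatever happened to be in memory at restore time. That is the "garbage" case for a vmemmap that resides on an offline block.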

OK, makes sense. Thanks for the clarification.

So there is indeed a difference. One way around that would be to mark
vmemmap pages (e.g. PageReserved && magic value stored somewhere in the
struct page - resembling bootmem vmemmaps) or mark sections fully backing
vmemmaps as online (ugly).

I'm sorry Michal, but then we are hacking around the online section size limitation just in another (IMHO more ugly) way than what Oscar and I came up with when discussing this in the past.

Your first approach would require us to look at potential garbage (pfn_to_online_page() == NULL) and filter out what might still be useful.

The second approach exposes garbage to the rest of the system as initialized memmap.


I honestly cannot say that I prefer either over what we have here.

--
Thanks,

David / dhildenb