[PATCH v2 0/1] memory_hotplug: fix the panic when memory end is not

From: Mikhail Zaslonko
Date: Mon Nov 05 2018 - 10:06:21 EST


This patch refers to the older thread:
https://marc.info/?t=153658306400001&r=1&w=2

I have tried the approaches suggested in that discussion, such as
ignoring memory that is not section-aligned much earlier, or
initializing struct pages beyond the "end", but both had issues.

First I tried to ignore unaligned memory early by adjusting the memory_end
value. However, the kernel 'mem=' parameter parsing and the memory_end
calculation take place in the architecture code, and adjusting it
afterwards in common code might be too late in my view. Also, with this
approach we might lose up to an entire section of memory (256 MB on s390)
just because of unfortunate alignment.
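
To illustrate the cost of rounding memory_end down, here is a minimal
userspace sketch, not kernel code. SECTION_SIZE and section_align_down()
are names made up for this illustration, and the 256 MB value is just the
s390 figure mentioned above.

```c
#include <stdint.h>

/* Assumption: 256 MB sections, the s390 figure quoted in this mail. */
#define SECTION_SIZE (256ULL << 20)

/* Hypothetical helper: round an address down to a section boundary. */
static uint64_t section_align_down(uint64_t addr)
{
	return addr & ~(SECTION_SIZE - 1);
}

/* Hypothetical helper: memory dropped by aligning memory_end down. */
static uint64_t section_align_loss(uint64_t memory_end)
{
	return memory_end - section_align_down(memory_end);
}
```

With a memory_end one 4 KB page short of 1 GB, section_align_down()
yields 768 MB, so almost a whole section (256 MB minus one page) would
be thrown away.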

Another approach was to fix memmap_init() and initialize struct pages
beyond the end. Since struct pages are allocated section-wise we can try to
round the size parameter passed to the memmap_init() function up to the
section boundary thus forcing the mapping initialization for the entire
section. But then it leads to another VM_BUG_ON panic due to
zone_spans_pfn() sanity check triggered for the first page of each page
block from set_pageblock_migratetype() function:
page dumped because: VM_BUG_ON_PAGE(!zone_spans_pfn(page_zone(page), pfn))
Call Trace:
([<00000000003013f8>] set_pfnblock_flags_mask+0xe8/0x140)
[<00000000003014aa>] set_pageblock_migratetype+0x5a/0x70
[<0000000000bef706>] memmap_init_zone+0x25e/0x2e0
[<00000000010fc3d8>] free_area_init_node+0x530/0x558
[<00000000010fcf02>] free_area_init_nodes+0x81a/0x8f0
[<00000000010e7fdc>] paging_init+0x124/0x130
[<00000000010e4dfa>] setup_arch+0xbf2/0xcc8
[<00000000010de9e6>] start_kernel+0x7e/0x588
[<000000000010007c>] startup_continue+0x7c/0x300
Last Breaking-Event-Address:
[<00000000003013f8>] set_pfnblock_flags_mask+0xe8/0x140

We might ignore this check for the struct pages beyond the "end" but I'm not
sure about further implications.
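
A simplified userspace model of why rounding the size up trips the
check: the zone span still ends at the real memory end, so any pfn in
the rounded-up tail falls outside it. struct zone_span,
span_contains_pfn() and round_up_to_section() are stand-ins invented for
this sketch; only the comparison follows the logic of zone_spans_pfn().

```c
#include <stdbool.h>

/* Stand-in for the span fields of the kernel's struct zone. */
struct zone_span {
	unsigned long zone_start_pfn;
	unsigned long spanned_pages;
};

/* Same comparison as zone_spans_pfn(): start <= pfn < start + spanned. */
static bool span_contains_pfn(const struct zone_span *z, unsigned long pfn)
{
	return z->zone_start_pfn <= pfn &&
	       pfn < z->zone_start_pfn + z->spanned_pages;
}

/*
 * Hypothetical helper: round a page count up to a multiple of
 * pages_per_section, which must be a power of two.
 */
static unsigned long round_up_to_section(unsigned long nr_pages,
					 unsigned long pages_per_section)
{
	return (nr_pages + pages_per_section - 1) & ~(pages_per_section - 1);
}
```

For a zone spanning 1000 pages and 65536 pages per section (256 MB with
4 KB pages), the rounded size is 65536, and every pfn from 1000 up to
65535 would be initialized while failing the span check.
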
For now I suggest staying with my original proposal of fixing the
specific functions used by the memory hotplug sysfs handlers.

Changes v1 -> v2:
* Expanded commit message to show both failing scenarios.
* Use 'pfn + i' instead of 'pfn' for zone_spans_pfn() check within
test_pages_in_a_zone() function thus taking CONFIG_HOLES_IN_ZONE into
consideration.
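
The 'pfn + i' point can be sketched in userspace as follows. This is a
toy model under stated assumptions, not the patch itself: BLOCK_PAGES
stands in for MAX_ORDER_NR_PAGES, toy_pfn_valid() for
pfn_valid_within(), and the zone is reduced to a [zone_start, zone_end)
range. When a block starts with a hole, the span check has to look at
the first valid pfn rather than at the start of the block.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical block size, standing in for MAX_ORDER_NR_PAGES. */
#define BLOCK_PAGES 4UL

/* Toy validity map lookup, standing in for pfn_valid_within(). */
static bool toy_pfn_valid(const bool *valid, size_t nr, unsigned long pfn)
{
	return pfn < nr && valid[pfn];
}

/*
 * Skip leading holes in the block starting at pfn, then test whether
 * the first valid page (pfn + i) lies inside [zone_start, zone_end).
 * Returns false if the block has no valid page at all.
 */
static bool block_first_valid_in_span(const bool *valid, size_t nr,
				      unsigned long pfn,
				      unsigned long zone_start,
				      unsigned long zone_end)
{
	unsigned long i = 0;

	while (i < BLOCK_PAGES && !toy_pfn_valid(valid, nr, pfn + i))
		i++;
	if (i == BLOCK_PAGES)
		return false;

	return zone_start <= pfn + i && pfn + i < zone_end;
}
```

If pfns 0 and 1 are holes and the zone covers pfns 2 and 3, checking
'pfn' (0) would reject the block, while checking 'pfn + i' (2) accepts
it.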

Mikhail Zaslonko (1):
memory_hotplug: fix the panic when memory end is not on the section
boundary

mm/memory_hotplug.c | 20 +++++++++++---------
1 file changed, 11 insertions(+), 9 deletions(-)

--
2.16.4