Re: [PATCH] mm/page_alloc: fix boot hang in memmap_init_zone

From: Jia He
Date: Thu Mar 15 2018 - 20:45:53 EST




On 3/15/2018 11:39 PM, Daniel Vacek Wrote:
On Thu, Mar 15, 2018 at 3:08 PM, Jia He <hejianet@xxxxxxxxx> wrote:
Hi Daniel



On 3/14/2018 6:42 AM, Daniel Vacek Wrote:
On some architectures (reported on arm64) commit 864b75f9d6b01
("mm/page_alloc: fix memmap_init_zone pageblock alignment")
causes a boot hang. This patch fixes the hang by making sure the alignment
never steps back.

Link: http://lkml.kernel.org/r/0485727b2e82da7efbce5f6ba42524b429d0391a.1520011945.git.neelx@xxxxxxxxxx
Fixes: 864b75f9d6b01 ("mm/page_alloc: fix memmap_init_zone pageblock alignment")
Signed-off-by: Daniel Vacek <neelx@xxxxxxxxxx>
Tested-by: Sudeep Holla <sudeep.holla@xxxxxxx>
Tested-by: Naresh Kamboju <naresh.kamboju@xxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxx>
Cc: Paul Burton <paul.burton@xxxxxxxxxx>
Cc: Pavel Tatashin <pasha.tatashin@xxxxxxxxxx>
Cc: Vlastimil Babka <vbabka@xxxxxxx>
Cc: <stable@xxxxxxxxxxxxxxx>
---
mm/page_alloc.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3d974cb2a1a1..e033a6895c6f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5364,9 +5364,14 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 			 * is not. move_freepages_block() can shift ahead of
 			 * the valid region but still depends on correct page
 			 * metadata.
+			 * Also make sure we never step back.
 			 */
-			pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
+			unsigned long next_pfn;
+
+			next_pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
 					~(pageblock_nr_pages-1)) - 1;
+			if (next_pfn > pfn)
+				pfn = next_pfn;
It didn't resolve the boot hang on my arm64 server.
What if memblock_next_valid_pfn(pfn, end_pfn) is 32 and pageblock_nr_pages
is 8192?
Then next_pfn will be (unsigned long)-1, which is larger than pfn.
So there is still an infinite loop here.
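To make the arithmetic above concrete, here is a minimal user-space sketch of
the alignment step. It is not the kernel code itself: PAGEBLOCK_NR_PAGES,
align_step() and patched_pfn() are hypothetical names introduced only for
illustration, and 8192 is an assumed power-of-two pageblock size.

```c
/* Assumption: stand-in for the kernel's pageblock_nr_pages (a power of two). */
#define PAGEBLOCK_NR_PAGES 8192UL

/*
 * Alignment step from the patch: round the next valid pfn down to the
 * start of its pageblock, then subtract 1 so that the loop's pfn++
 * lands on the pageblock start.
 */
static unsigned long align_step(unsigned long next_valid_pfn)
{
	return (next_valid_pfn & ~(PAGEBLOCK_NR_PAGES - 1)) - 1;
}

/*
 * The guard added by the patch: only move pfn if the aligned value is
 * larger. When the aligned value underflows to (unsigned long)-1 the
 * comparison still succeeds, so the guard does not catch that case.
 */
static unsigned long patched_pfn(unsigned long pfn, unsigned long next_valid_pfn)
{
	unsigned long next_pfn = align_step(next_valid_pfn);

	if (next_pfn > pfn)
		pfn = next_pfn;
	return pfn;
}
```

Under these assumptions align_step(32) is (unsigned long)-1, patched_pfn()
accepts it for any smaller pfn, and the following pfn++ wraps around to 0.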
Hi Jia,

Yeah, looks like another uncovered case. No one has reported this so far.
Anyway, upstream reverted all this for now and we're discussing the
right approach here.

In any case thanks for this report. Can you share something like below
from your machine?
Sure.
[    0.000000] NUMA: Faking a node at [mem 0x0000000000000000-0x00000017ffffffff]
[    0.000000] NUMA: NODE_DATA [mem 0x17ffffcb80-0x17ffffffff]
[    0.000000] Zone ranges:
[    0.000000]   DMA32    [mem 0x0000000000200000-0x00000000ffffffff]
[    0.000000]   Normal   [mem 0x0000000100000000-0x00000017ffffffff]
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000000200000-0x000000000021ffff]
[    0.000000]   node   0: [mem 0x0000000000820000-0x000000000307ffff]
[    0.000000]   node   0: [mem 0x0000000003080000-0x000000000308ffff]
[    0.000000]   node   0: [mem 0x0000000003090000-0x00000000031fffff]
[    0.000000]   node   0: [mem 0x0000000003200000-0x00000000033fffff]
[    0.000000]   node   0: [mem 0x0000000003410000-0x000000000563ffff]
[    0.000000]   node   0: [mem 0x0000000005640000-0x000000000567ffff]
[    0.000000]   node   0: [mem 0x0000000005680000-0x00000000056dffff]
[    0.000000]   node   0: [mem 0x00000000056e0000-0x00000000086fffff]
[    0.000000]   node   0: [mem 0x0000000008700000-0x000000000871ffff]
[    0.000000]   node   0: [mem 0x0000000008720000-0x000000000894ffff]
[    0.000000]   node   0: [mem 0x0000000008950000-0x0000000008baffff]
[    0.000000]   node   0: [mem 0x0000000008bb0000-0x0000000008bcffff]
[    0.000000]   node   0: [mem 0x0000000008bd0000-0x0000000008c4ffff]
[    0.000000]   node   0: [mem 0x0000000008c50000-0x0000000008e2ffff]
[    0.000000]   node   0: [mem 0x0000000008e30000-0x0000000008e4ffff]
[    0.000000]   node   0: [mem 0x0000000008e50000-0x0000000008fcffff]
[    0.000000]   node   0: [mem 0x0000000008fd0000-0x000000000910ffff]
[    0.000000]   node   0: [mem 0x0000000009110000-0x00000000092effff]
[    0.000000]   node   0: [mem 0x00000000092f0000-0x000000000930ffff]
[    0.000000]   node   0: [mem 0x0000000009310000-0x000000000963ffff]
[    0.000000]   node   0: [mem 0x0000000009640000-0x000000000e61ffff]
[    0.000000]   node   0: [mem 0x000000000e620000-0x000000000e64ffff]
[    0.000000]   node   0: [mem 0x000000000e650000-0x000000000fffffff]
[    0.000000]   node   0: [mem 0x0000000010800000-0x0000000017feffff]
[    0.000000]   node   0: [mem 0x000000001c000000-0x000000001c00ffff]
[    0.000000]   node   0: [mem 0x000000001c010000-0x000000001c7fffff]
[    0.000000]   node   0: [mem 0x000000001c810000-0x000000007efbffff]
[    0.000000]   node   0: [mem 0x000000007efc0000-0x000000007efdffff]
[    0.000000]   node   0: [mem 0x000000007efe0000-0x000000007efeffff]
[    0.000000]   node   0: [mem 0x000000007eff0000-0x000000007effffff]
[    0.000000]   node   0: [mem 0x000000007f000000-0x00000017ffffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000000200000-0x00000017ffffffff]

--
Cheers,
Jia