Re: [PATCH] mm/page_alloc: fix boot hang in memmap_init_zone
From: Daniel Vacek
Date: Thu Mar 15 2018 - 11:39:14 EST
On Thu, Mar 15, 2018 at 3:08 PM, Jia He <hejianet@xxxxxxxxx> wrote:
> Hi Daniel
>
>
>
> On 3/14/2018 6:42 AM, Daniel Vacek Wrote:
>>
>> On some architectures (reported on arm64) commit 864b75f9d6b01
>> ("mm/page_alloc: fix memmap_init_zone pageblock alignment")
>> causes a boot hang. This patch fixes the hang making sure the alignment
>> never steps back.
>>
>> Link: http://lkml.kernel.org/r/0485727b2e82da7efbce5f6ba42524b429d0391a.1520011945.git.neelx@xxxxxxxxxx
>> Fixes: 864b75f9d6b01 ("mm/page_alloc: fix memmap_init_zone pageblock alignment")
>> Signed-off-by: Daniel Vacek <neelx@xxxxxxxxxx>
>> Tested-by: Sudeep Holla <sudeep.holla@xxxxxxx>
>> Tested-by: Naresh Kamboju <naresh.kamboju@xxxxxxxxxx>
>> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
>> Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
>> Cc: Michal Hocko <mhocko@xxxxxxxx>
>> Cc: Paul Burton <paul.burton@xxxxxxxxxx>
>> Cc: Pavel Tatashin <pasha.tatashin@xxxxxxxxxx>
>> Cc: Vlastimil Babka <vbabka@xxxxxxx>
>> Cc: <stable@xxxxxxxxxxxxxxx>
>> ---
>> mm/page_alloc.c | 7 ++++++-
>> 1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 3d974cb2a1a1..e033a6895c6f 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -5364,9 +5364,14 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
>> * is not. move_freepages_block() can shift ahead of
>> * the valid region but still depends on correct page
>> * metadata.
>> + * Also make sure we never step back.
>> */
>> - pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
>> + unsigned long next_pfn;
>> +
>> + next_pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
>> ~(pageblock_nr_pages-1)) - 1;
>> + if (next_pfn > pfn)
>> + pfn = next_pfn;
>
> It didn't resolve the booting hang issue in my arm64 server.
> what if memblock_next_valid_pfn(pfn, end_pfn) is 32 and pageblock_nr_pages
> is 8196?
> Thus, next_pfn will be (unsigned long)-1 and be larger than pfn.
> So still there is an infinite loop here.
Hi Jia,
Yeah, looks like another uncovered case. No one has reported this so far.
Anyway, upstream has reverted all of this for now and we're discussing the
right approach here.
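For illustration, the arithmetic behind this case can be checked in user
space. The sketch below is not kernel code: aligned_next_pfn() and
patched_step() are hypothetical helpers that only mirror the expression and
the guard from the patch, and PAGEBLOCK_NR_PAGES is assumed to be 8192 (a
power of two, which the mask needs). When the next valid pfn lies below one
pageblock, the aligned value is 0 and subtracting 1 wraps to ULONG_MAX, so
the "next_pfn > pfn" check passes.

```c
#include <limits.h>

/* Assumed value for illustration only; the real constant is arch/config dependent. */
#define PAGEBLOCK_NR_PAGES 8192UL

/* Mirrors the patched expression: round down to a pageblock boundary, then subtract 1. */
unsigned long aligned_next_pfn(unsigned long next_valid_pfn)
{
	return (next_valid_pfn & ~(PAGEBLOCK_NR_PAGES - 1)) - 1;
}

/* Mirrors the guard added by the patch: only take next_pfn if it is larger than pfn. */
unsigned long patched_step(unsigned long pfn, unsigned long next_valid_pfn)
{
	unsigned long next_pfn = aligned_next_pfn(next_valid_pfn);

	if (next_pfn > pfn)
		pfn = next_pfn;
	return pfn;
}
```

With these assumptions, patched_step(10, 32) returns ULONG_MAX rather than
leaving pfn alone. If the enclosing loop then increments pfn, it would wrap
to 0 and walk the same range again, which would be consistent with the hang
reported above.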
In any case, thanks for this report. Can you share something like the boot
log below from your machine?
Booting Linux on physical CPU 0x0000000000 [0x410fd034]
Linux version 4.16.0-rc5-00004-gfc6eabbbf8ef-dirty (ard@dogfood) ...
Machine model: Socionext Developer Box
earlycon: pl11 at MMIO 0x000000002a400000 (options '')
bootconsole [pl11] enabled
efi: Getting EFI parameters from FDT:
efi: EFI v2.70 by Linaro
efi: SMBIOS 3.0=0xff580000 ESRT=0xf9948198 MEMATTR=0xf83b1a98 RNG=0xff7ac898
random: fast init done
efi: seeding entropy pool
esrt: Reserving ESRT space from 0x00000000f9948198 to 0x00000000f99481d0.
cma: Reserved 16 MiB at 0x00000000fd800000
NUMA: No NUMA configuration found
NUMA: Faking a node at [mem 0x0000000000000000-0x0000000fffffffff]
NUMA: NODE_DATA [mem 0xffffd8d80-0xffffda87f]
Zone ranges:
DMA32 [mem 0x0000000080000000-0x00000000ffffffff]
Normal [mem 0x0000000100000000-0x0000000fffffffff]
Movable zone start for each node
Early memory node ranges
node 0: [mem 0x0000000080000000-0x00000000febeffff]
node 0: [mem 0x00000000febf0000-0x00000000fefcffff]
node 0: [mem 0x00000000fefd0000-0x00000000ff43ffff]
node 0: [mem 0x00000000ff440000-0x00000000ff7affff]
node 0: [mem 0x00000000ff7b0000-0x00000000ffffffff]
node 0: [mem 0x0000000880000000-0x0000000fffffffff]
Initmem setup node 0 [mem 0x0000000080000000-0x0000000fffffffff]
Thank you.
--nX
> Cheers,
> Jia He
>>
>> #endif
>> continue;
>> }
>
>
> --
> Cheers,
> Jia
>