Re: [PATCH] mm/pages_alloc.c: Don't create ZONE_MOVABLE beyond the end of a node

From: Alistair Popple
Date: Tue Feb 15 2022 - 00:46:14 EST


Anshuman Khandual <anshuman.khandual@xxxxxxx> writes:

> Hi Alistair,
>
> On 2/15/22 8:28 AM, Alistair Popple wrote:
>> ZONE_MOVABLE uses the remaining memory in each node. Its starting pfn
>> is also aligned to MAX_ORDER_NR_PAGES. It is possible for the remaining
>> memory in a node to be less than MAX_ORDER_NR_PAGES, meaning there is
>> not enough room for ZONE_MOVABLE on that node.
>
> How plausible is this scenario on normal systems ?

Probably not very. I happened to run into this on my development/test x86 VM
which has 8GB and was booted with `numa=fake=4 kernelcore=60%` but in theory I
guess any system that has a node with less than MAX_ORDER_NR_PAGES left over for
ZONE_MOVABLE may be susceptible.

This was the RAM map:

[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
[ 0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000007ffddfff] usable
[ 0.000000] BIOS-e820: [mem 0x000000007ffde000-0x000000007fffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000b0000000-0x00000000bfffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000027fffffff] usable

[...]

[ 0.065897] Early memory node ranges
[ 0.065898] node 0: [mem 0x0000000000001000-0x000000000009efff]
[ 0.065900] node 0: [mem 0x0000000000100000-0x000000007ffddfff]
[ 0.065902] node 1: [mem 0x0000000100000000-0x000000017fffffff]
[ 0.065904] node 2: [mem 0x0000000180000000-0x00000001ffffffff]
[ 0.065906] node 3: [mem 0x0000000200000000-0x000000027fffffff]

Note the reserved range from 0x000000007ffde000 to 0x000000007fffffff resulting
in node-0 ending at 0x000000007ffddfff.

> Should not the node always contain MAX_ORDER_NR_PAGES aligned pages ? Also all
> zones which get created from that node should also be MAX_ORDER_NR_PAGES
> aligned ?

I'm not sure why that would be the case given page size and MAX_ORDER_NR_PAGES can
be set via a kernel configuration parameter. Obviously it wasn't the case here
or this situation would not arise. That said I don't know this code well, and
this was where I decided to stop shaving this yak so it's possible there is an
even deeper underlying issue.

Either way I don't *think* the fix should introduce any problems as it shouldn't
do anything unless you were going to hit this issue anyway (which took some time
to track down as the cause wasn't obvious).

> I am just curious how a node could end up being like this.

> - Anshuman