Re: [mm 4.15-rc8] Random oopses under memory pressure.

From: Linus Torvalds
Date: Wed Jan 17 2018 - 16:51:59 EST


On Wed, Jan 17, 2018 at 1:39 PM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> In fact, the whole
>
> pfn_valid_within(buddy_pfn)
>
> test looks very odd. Maybe the pfn of the buddy is valid, but it's not
> in the same zone? Then we'd combine the two pages in two different
> zones into one combined page.

It might also be the same allocation zone, but if the pfn's are in
different sparsemem sections that would also be problematic.

But I hope/assume that all sparsemem sections are always aligned to
(PAGE_SIZE << MAXORDER).

In contrast, the ZONE_HIGHMEM limit really does seems to be
potentially not aligned to anything, ie

arch/x86/include/asm/pgtable_32_types.h:
#define MAXMEM (VMALLOC_END - PAGE_OFFSET - __VMALLOC_RESERVE)

which I have no idea what the alignment is, but VMALLOC_END at least
does not seem to have any MAXORDER alignment.

So it really does look like the zone for two page orders that would
otherwise be buddies might actually be different.

Interesting if this really is the case. Because afaik, if that
WARN_ON_ONCE actually triggers, it does seem like this bug could go
back pretty much forever.

In fact, it seems to be such a fundamental bug that I suspect I'm
entirely wrong, and full of shit. So it's an interesting and not
_obviously_ incorrect theory, but I suspect I must be missing
something.

Linus