Re: [patch] mm, page_alloc: move_freepages should not examine struct page of reserved memory

From: Vlastimil Babka
Date: Wed Aug 14 2019 - 03:42:11 EST


On 8/13/19 7:22 PM, David Rientjes wrote:
> On Tue, 13 Aug 2019, Vlastimil Babka wrote:
>
>> > After commit 907ec5fca3dc ("mm: zero remaining unavailable struct pages"),
>> > struct page of reserved memory is zeroed. This causes page->flags to be 0
>> > and fixes issues related to reading /proc/kpageflags, for example, of
>> > reserved memory.
>> >
>> > The VM_BUG_ON() in move_freepages_block(), however, assumes that
>> > page_zone() is meaningful even for reserved memory. That assumption is no
>> > longer true after the aforementioned commit.
>>
>> How does move_freepages_block() get called on reserved memory in
>> the first place?
>>
>
> It's simply math after finding a valid free page from the per-zone free
> area to use as fallback. We find the beginning and end of the pageblock
> of the valid page and that can bring us into memory that was reserved per
> the e820 map. pfn_valid() is still true (it's backed by a struct page), but
> since the struct page is zeroed we shouldn't infer anything from its node or
> zone. The current node check just happens to succeed most of the time by
> luck because reserved memory typically appears on node 0.
>
> The fix here is to validate that we actually have buddy pages before
> testing if there's any type of zone or node strangeness going on.

I see, thanks.
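
For anyone following along, the pageblock arithmetic described above can be
sketched in userspace C. This is only an illustration, not the kernel code:
PAGEBLOCK_ORDER of 9 is an assumption (2MB pageblocks with 4K pages), and
struct pfn_range and pageblock_touches_reserved() are made-up names.

```c
#include <stdbool.h>

/* Assumed value: 2MB pageblocks with 4K pages (order 9 => 512 pages). */
#define PAGEBLOCK_ORDER    9
#define PAGEBLOCK_NR_PAGES (1UL << PAGEBLOCK_ORDER)

/* Hypothetical reserved range, standing in for an e820 reserved region. */
struct pfn_range {
	unsigned long start;	/* first pfn, inclusive */
	unsigned long end;	/* last pfn, inclusive */
};

/* First pfn of the pageblock containing pfn (round down to the block size). */
static unsigned long pageblock_start(unsigned long pfn)
{
	return pfn & ~(PAGEBLOCK_NR_PAGES - 1);
}

/* Last pfn of the pageblock containing pfn. */
static unsigned long pageblock_end(unsigned long pfn)
{
	return pageblock_start(pfn) + PAGEBLOCK_NR_PAGES - 1;
}

/* True if the pageblock around a valid free pfn overlaps the reserved range. */
static bool pageblock_touches_reserved(unsigned long free_pfn,
				       struct pfn_range resv)
{
	return pageblock_start(free_pfn) <= resv.end &&
	       pageblock_end(free_pfn) >= resv.start;
}
```

With a free page at pfn 0x1234, the block spans 0x1200..0x13ff, so a reserved
range of 0x1200..0x120f ends up inside the range being walked even though the
free page itself is perfectly valid.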


>> > @@ -2273,6 +2258,10 @@ static int move_freepages(struct zone *zone,
>> > 			continue;
>> > 		}
>> >
>> > +		/* Make sure we are not inadvertently changing nodes */
>> > +		VM_BUG_ON_PAGE(page_to_nid(page) != zone_to_nid(zone), page);
>> > +		VM_BUG_ON_PAGE(page_zone(page) != zone, page);
>>
>> The latter check implies the former check, so if it's to stay, the first
>> one could be removed and comment adjusted s/nodes/zones/
>>
>
> Does it? The first is checking for a corrupted page_to_nid(); the second is
> checking for a corrupted or unexpected page_zone(). What's being tested
> here is the state of struct page, as it was prior to this patch, not
> the state of struct zone.

page_zone() calls page_to_nid() internally, so if nid was wrong, the resulting
zone pointer would also be wrong. But if you want more fine-grained bug output,
that's fine.
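
To illustrate, here is a simplified userspace model of that dependency. The
flags layout (two zone bits, one node bit), the array sizes and make_flags()
are assumptions made up for this sketch; only the shape of page_zone(), which
indexes the node's zone array using page_to_nid(), mirrors the kernel.

```c
#define MAX_NODES	2	/* assumed small values for illustration */
#define MAX_ZONES	4

/* Hypothetical flags layout: bits [1:0] = zone index, bit [2] = node id. */
#define ZONES_MASK	0x3UL
#define NODES_SHIFT	2
#define NODES_MASK	0x1UL

struct zone { int unused; };
struct pglist_data { struct zone node_zones[MAX_ZONES]; };
struct page { unsigned long flags; };

static struct pglist_data nodes[MAX_NODES];

/* Hypothetical helper: encode node and zone index into page flags. */
static unsigned long make_flags(unsigned long nid, unsigned long zone_idx)
{
	return (nid << NODES_SHIFT) | zone_idx;
}

static int page_to_nid(const struct page *page)
{
	return (int)((page->flags >> NODES_SHIFT) & NODES_MASK);
}

static int page_zonenum(const struct page *page)
{
	return (int)(page->flags & ZONES_MASK);
}

/* The zone pointer is derived from the nid, so a wrong nid gives a wrong zone. */
static struct zone *page_zone(const struct page *page)
{
	return &nodes[page_to_nid(page)].node_zones[page_zonenum(page)];
}
```

In this model a zeroed struct page decodes to node 0, zone 0, and any page
whose nid bits differ from the expected node can never compare equal to a
zone on that node, so the page_zone() check alone would already catch it.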