Re: [patch] mm, page_alloc: move_freepages should not examine struct page of reserved memory
From: David Rientjes
Date: Tue Aug 13 2019 - 19:31:39 EST
On Tue, 13 Aug 2019, Andrew Morton wrote:
> > After commit 907ec5fca3dc ("mm: zero remaining unavailable struct pages"),
> > struct page of reserved memory is zeroed. This causes page->flags to be 0
> > and fixes issues related to reading /proc/kpageflags, for example, of
> > reserved memory.
> >
> > The VM_BUG_ON() in move_freepages_block(), however, assumes that
> > page_zone() is meaningful even for reserved memory. That assumption is no
> > longer true after the aforementioned commit.
> >
> > There's no reason why move_freepages_block() should be testing the
> > legitimacy of page_zone() for reserved memory; its scope is limited only
> > to pages on the zone's freelist.
> >
> > Note that pfn_valid() can be true for reserved memory: there is a backing
> > struct page. The check for page_to_nid(page) is also buggy but reserved
> > memory normally only appears on node 0 so the zeroing doesn't affect this.
> >
> > Move the debug checks to after verifying PageBuddy is true. This isolates
> > the scope of the checks to only be for buddy pages which are on the zone's
> > freelist which move_freepages_block() is operating on. In this case, an
> > incorrect node or zone is a bug worthy of being warned about (and the
> > examination of struct page is acceptable because this memory is not
> > reserved).
>
> I'm thinking Fixes:907ec5fca3dc and Cc:stable? But 907ec5fca3dc is
> almost a year old, so you were doing something special to trigger this?
>
We noticed it almost immediately after bringing 907ec5fca3dc in on
CONFIG_DEBUG_VM builds. It depends on finding specific free pages in the
per-zone free area where the math in move_freepages() will bring the start
or end pfn into reserved memory and wanting to claim that entire pageblock
as a new migratetype. So the path will be rare, require CONFIG_DEBUG_VM,
and require fallback to a different migratetype.
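
To make the rounding and the check ordering concrete, here is a minimal
userspace sketch, not the kernel code itself. Everything in it is a
hypothetical stand-in: struct mock_page, PAGEBLOCK_NR_PAGES (assumed to
be 512), pageblock_start(), pageblock_end() and count_movable() do not
exist in the kernel, and every free page is treated as order-0 for
simplicity. A zeroed entry models the struct page of reserved memory.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed pageblock size for illustration; must be a power of two. */
#define PAGEBLOCK_NR_PAGES 512UL

/* Hypothetical stand-in for struct page. A zeroed entry models a
 * reserved page whose zone information is meaningless. */
struct mock_page {
	bool buddy;	/* analogue of PageBuddy(): on the zone free list */
	int zone_id;	/* analogue of page_zone() */
};

/* Round pfn down to the first pfn of its pageblock. */
static unsigned long pageblock_start(unsigned long pfn)
{
	return pfn & ~(PAGEBLOCK_NR_PAGES - 1);
}

/* Last pfn of the pageblock containing pfn. */
static unsigned long pageblock_end(unsigned long pfn)
{
	return pageblock_start(pfn) + PAGEBLOCK_NR_PAGES - 1;
}

/*
 * Walk [start, end] and count the free pages that would be moved.
 * The zone is examined only after the buddy test, so zeroed (reserved)
 * entries are skipped without being inspected. Returns -1 where the
 * debug check would fire: a buddy page that is in the wrong zone.
 */
static long count_movable(const struct mock_page *map, unsigned long start,
			  unsigned long end, int zone_id)
{
	long moved = 0;
	unsigned long pfn;

	assert(start <= end);
	for (pfn = start; pfn <= end; pfn++) {
		const struct mock_page *page = &map[pfn];

		if (!page->buddy)
			continue;	/* reserved or in use: leave it alone */
		if (page->zone_id != zone_id)
			return -1;	/* wrong zone for a buddy page: a real bug */
		moved++;
	}
	return moved;
}
```

With a free page at pfn 700, the rounding yields the range 512..1023, so
the walk covers pfns far from the page that was actually found. If the
start of that range is reserved, the zeroed entries there are skipped
rather than tripping the zone check.
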
Some struct pages for reserved memory were already zeroed before
907ec5fca3dc, so it theoretically could trigger before this commit. I think
it's rare enough, and behind a config option that most people don't run,
that others may not have noticed. I wouldn't argue against a stable tag,
and the backport should be easy enough, but I probably wouldn't single out
a commit that this is fixing.