Re: [PATCH v1 1/1] mm: buddy page accessed before initialized

From: Pavel Tatashin
Date: Thu Nov 02 2017 - 09:40:16 EST


On 11/02/2017 09:32 AM, Michal Hocko wrote:
On Tue 31-10-17 11:50:02, Pavel Tatashin wrote:
[...]
The problem happens in this path:

page_alloc_init_late
 deferred_init_memmap
  deferred_init_range
   __def_free
    deferred_free_range
     __free_pages_boot_core(page, order)
      __free_pages()
       __free_pages_ok()
        free_one_page()
         __free_one_page(page, pfn, zone, order, migratetype);

deferred_init_range() initializes one page at a time by calling
__init_single_page(). Once it has initialized pageblock_nr_pages pages, it
calls deferred_free_range() to free the initialized pages to the buddy
allocator. Eventually, we reach __free_one_page(), where we compute the
buddy page:
buddy_pfn = __find_buddy_pfn(pfn, order);
buddy = page + (buddy_pfn - pfn);

buddy_pfn is computed as pfn ^ (1 << order), which in this case is
pfn + pageblock_nr_pages. Therefore, the buddy page is the first page past
the range that has just been initialized, and we access this page in this
function before it is initialized. Also, later, when we return to
deferred_init_range(), the buddy page is initialized again.

So, in order to avoid this issue, we must initialize the buddy page prior
to calling deferred_free_range().

How come we didn't have this problem previously? I am really confused.


Hi Michal,

Do you mean before my project? Back then, memory for all struct pages was always zeroed in memblock, so in __free_one_page() page_is_buddy() always returned false, and we never tried to incorrectly remove the buddy from the list:

list_del(&buddy->lru);

Now that memory is not zeroed, page_is_buddy() can return true after kexec, when memory is dirty (unfortunately, memset(1) with CONFIG_DEBUG_VM does not catch this case), and we then proceed to incorrectly remove the buddy from the list.

This is why we must initialize the computed buddy page beforehand.

Pasha