Re: [PATCH v9 01/12] x86/mm: setting fields in deferred pages

From: Michal Hocko
Date: Tue Oct 03 2017 - 08:27:10 EST


On Wed 20-09-17 16:17:03, Pavel Tatashin wrote:
> Without deferred struct page feature (CONFIG_DEFERRED_STRUCT_PAGE_INIT),
> flags and other fields in "struct page"es are never changed prior to first
> initializing struct pages by going through __init_single_page().
>
> With deferred struct page feature enabled, however, we set fields in
> register_page_bootmem_info that are subsequently clobbered right after in
> free_all_bootmem:
>
> mem_init() {
> 	register_page_bootmem_info();
> 	free_all_bootmem();
> 	...
> }
>
> When register_page_bootmem_info() is called, only non-deferred struct pages
> are initialized. But this function walks some reserved pages that might be
> in the deferred range, and thus are not yet initialized.
>
> mem_init
>  register_page_bootmem_info
>   register_page_bootmem_info_node
>    get_page_bootmem
>     .. setting fields here ..
>     such as: page->freelist = (void *)type;
>
>  free_all_bootmem()
>   free_low_memory_core_early()
>    for_each_reserved_mem_region()
>     reserve_bootmem_region()
>      init_reserved_page() <- Only if this is deferred reserved page
>       __init_single_pfn()
>        __init_single_page()
>         memset(0) <-- Lose the set fields here
>
> We end up with an issue that we currently do not observe, because the memory
> is explicitly zeroed. But if the flag asserts are changed, we can start
> hitting problems.
>
> Also, because in this patch series we will stop zeroing struct page memory
> during allocation, we must make sure that struct pages are properly
> initialized prior to using them.
>
> The deferred reserved pages are initialized in free_all_bootmem().
> Therefore, the fix is to swap the order of the two calls above.

Thanks for extending the changelog. This is more informative now.
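
For anyone following the thread, the ordering problem in the changelog can
be modelled in user space. This is only a minimal sketch, not kernel code:
struct toy_page, the toy_* functions, the page counts and the type value 5
are all hypothetical stand-ins chosen for illustration. It assumes two
pages were initialized early and two are deferred reserved pages.

```c
#include <assert.h>
#include <string.h>

#define TOY_NR_PAGES 4
#define TOY_NR_EARLY 2	/* hypothetical: pages initialized before mem_init() */

/* Toy stand-in for struct page; only the field named in the changelog. */
struct toy_page {
	void *freelist;
	int initialized;
};

/* Models __init_single_page(): zeroes the struct, then marks it ready. */
static void toy_init_single_page(struct toy_page *page)
{
	memset(page, 0, sizeof(*page));
	page->initialized = 1;
}

/* Models free_all_bootmem(): initializes the deferred reserved pages. */
static void toy_free_all_bootmem(struct toy_page *pages)
{
	for (int i = 0; i < TOY_NR_PAGES; i++)
		if (!pages[i].initialized)
			toy_init_single_page(&pages[i]);
}

/* Models get_page_bootmem(): stores a type tag in page->freelist. */
static void toy_register_page_bootmem_info(struct toy_page *pages,
					   unsigned long type)
{
	for (int i = 0; i < TOY_NR_PAGES; i++)
		pages[i].freelist = (void *)type;
}

/* Returns how many pages still carry the tag after both steps have run. */
static int toy_count_tagged(int register_first)
{
	struct toy_page pages[TOY_NR_PAGES];
	const unsigned long type = 5;	/* arbitrary illustrative tag */
	int tagged = 0;

	/* Memory starts zeroed; only the early pages are initialized. */
	memset(pages, 0, sizeof(pages));
	for (int i = 0; i < TOY_NR_EARLY; i++)
		toy_init_single_page(&pages[i]);

	if (register_first) {		/* old order */
		toy_register_page_bootmem_info(pages, type);
		toy_free_all_bootmem(pages);
	} else {			/* order after this patch */
		toy_free_all_bootmem(pages);
		toy_register_page_bootmem_info(pages, type);
	}

	for (int i = 0; i < TOY_NR_PAGES; i++) {
		assert(pages[i].initialized);
		if (pages[i].freelist == (void *)type)
			tagged++;
	}
	return tagged;
}
```

In this model, toy_count_tagged(1) loses the tag on the two deferred pages
because they are zeroed after being tagged, while toy_count_tagged(0) keeps
it on all four.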

> Signed-off-by: Pavel Tatashin <pasha.tatashin@xxxxxxxxxx>
> Reviewed-by: Steven Sistare <steven.sistare@xxxxxxxxxx>
> Reviewed-by: Daniel Jordan <daniel.m.jordan@xxxxxxxxxx>
> Reviewed-by: Bob Picco <bob.picco@xxxxxxxxxx>

I hope I haven't missed anything but it looks good to me.

Acked-by: Michal Hocko <mhocko@xxxxxxxx>

One nit below.
> ---
> arch/x86/mm/init_64.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 5ea1c3c2636e..30fe22558720 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -1182,12 +1182,17 @@ void __init mem_init(void)
>
>  	/* clear_bss() already clear the empty_zero_page */
>
> -	register_page_bootmem_info();
> -
>  	/* this will put all memory onto the freelists */
>  	free_all_bootmem();
>  	after_bootmem = 1;
>
> +	/* Must be done after boot memory is put on freelist, because here we

The standard coding style for a multi-line comment is to leave the opening
line empty:
	/*
	 * text starts here

> +	 * might set fields in deferred struct pages that have not yet been
> +	 * initialized, and free_all_bootmem() initializes all the reserved
> +	 * deferred pages for us.
> +	 */
> +	register_page_bootmem_info();
> +
>  	/* Register memory areas for /proc/kcore */
>  	kclist_add(&kcore_vsyscall, (void *)VSYSCALL_ADDR,
>  			 PAGE_SIZE, KCORE_OTHER);
> --
> 2.14.1

--
Michal Hocko
SUSE Labs