Re: [PATCH v2] mm/page_alloc: fix alloc_pages_bulk/set_page_owner panic on irq disabled

From: Mel Gorman
Date: Thu Jul 15 2021 - 07:57:35 EST


On Mon, Jul 12, 2021 at 10:23:32AM +0800, Yang Huan wrote:
> BUG: sleeping function called from invalid context at mm/page_alloc.c:5179
> in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 1, name: swapper/0
>
> __dump_stack lib/dump_stack.c:79 [inline]
> dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:96
> ___might_sleep.cold+0x1f1/0x237 kernel/sched/core.c:9153
> prepare_alloc_pages+0x3da/0x580 mm/page_alloc.c:5179
> __alloc_pages+0x12f/0x500 mm/page_alloc.c:5375
> alloc_pages+0x18c/0x2a0 mm/mempolicy.c:2272
> stack_depot_save+0x39d/0x4e0 lib/stackdepot.c:303
> save_stack+0x15e/0x1e0 mm/page_owner.c:120
> __set_page_owner+0x50/0x290 mm/page_owner.c:181
> prep_new_page mm/page_alloc.c:2445 [inline]
> __alloc_pages_bulk+0x8b9/0x1870 mm/page_alloc.c:5313
>
> The problem is caused by set_page_owner alloc memory to save stack with
> GFP_KERNEL in local_riq disabled.
> So, we just can't assume that alloc flags should be same with new page,
> prep_new_page should prep/trace the page gfp, but shouldn't use the same
> gfp to get memory, let's depend on caller.
> So, here is two gfp flags, alloc_gfp used to alloc memory, depend on
> caller, page_gfp_mask is page's gfp, used to trace/prep itself
> But in most situation, same is ok, in alloc_pages_bulk, use GFP_ATOMIC
> is ok.(even if set_page_owner save backtrace failed, limited impact)
>
> v2:
> - add more description.
>
> Fixes: 0f87d9d30f21 ("mm/page_alloc: add an array-based interface to the bulk page allocator")
> Reported-by: syzbot+b07d8440edb5f8988eea@xxxxxxxxxxxxxxxxxxxxxxxxx
> Suggested-by: Wang Qing <wangqing@xxxxxxxx>
> Signed-off-by: Yang Huan <link@xxxxxxxx>

https://lore.kernel.org/lkml/20210713152100.10381-2-mgorman@xxxxxxxxxxxxxxxxxxx/
is now part of a series that has being sent to Linus. Hence, the Fixes
part is no longer applicable and the patch will no longer be addresing
an atomic sleep bug. This patch should be treated as an enhancement
to allow bulk allocations when PAGE_OWNER is set. For that, it should
include a note on the performance if PAGE_OWNER is used with either NFS
or high-speed networking to justify the additional complexity.

--
Mel Gorman
SUSE Labs