Re: [PATCH] mm/page_alloc: use batch page clearing in kernel_init_pages()
From: Andrew Morton
Date: Wed Apr 08 2026 - 11:32:43 EST
On Wed, 8 Apr 2026 16:14:03 +0530 "Salunke, Hrushikesh" <hsalunke@xxxxxxx> wrote:
> kernel_init_pages() runs inside the allocator (post_alloc_hook and
> __free_pages_prepare), so it inherits whatever context the caller is in.
> Testing with CONFIG_DEBUG_ATOMIC_SLEEP=y and CONFIG_PROVE_LOCKING=y, I
> hit this during exit_group() -> exit_mmap() -> __zap_vma_range, where a
> page allocation happens while the PTE lock and RCU read lock are held,
> making the cond_resched() in the clearing loop illegal:
>
> [ 1997.353228] BUG: sleeping function called from invalid context at mm/page_alloc.c:1235
> [ 1997.353433] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 19725, name: bash
> [ 1997.353572] preempt_count: 1, expected: 0
> [ 1997.353706] RCU nest depth: 1, expected: 0
> [ 1997.353837] 3 locks held by bash/19725:
> [ 1997.353839] #0: ff38cd415971e540 (&mm->mmap_lock){++++}-{4:4}, at: exit_mmap+0x6e/0x430
> [ 1997.353850] #1: ffffffffb03d6f60 (rcu_read_lock){....}-{1:3}, at: __pte_offset_map+0x2c/0x220
> [ 1997.353855] #2: ff38cd410deb4618 (ptlock_ptr(ptdesc)#2){+.+.}-{3:3}, at: pte_offset_map_lock+0x92/0x170
> [ 1997.353868] Call Trace:
> [ 1997.353870] <TASK>
> [ 1997.353873] dump_stack_lvl+0x91/0xb0
> [ 1997.353877] __might_resched+0x15f/0x290
> [ 1997.353882] kernel_init_pages+0x4b/0xa0
> [ 1997.353886] get_page_from_freelist+0x406/0x1e60
> [ 1997.353895] __alloc_frozen_pages_noprof+0x1d8/0x1730
> [ 1997.353912] alloc_pages_mpol+0xa4/0x190
> [ 1997.353917] alloc_pages_noprof+0x59/0xd0
> [ 1997.353919] get_free_pages_noprof+0x11/0x40
> [ 1997.353921] __tlb_remove_folio_pages_size.isra.0+0x7f/0xe0
> [ 1997.353923] __zap_vma_range+0x1bbd/0x1f40
> [ 1997.353931] unmap_vmas+0xd9/0x1d0
> [ 1997.353934] exit_mmap+0x10a/0x430
> [ 1997.353943] __mmput+0x3d/0x130
> [ 1997.353947] do_exit+0x2a7/0xae0
tlb_next_batch() is (fortunately) using GFP_NOWAIT. Perhaps you can
alter your patch to not call the cond_resched() if the caller is
attempting an atomic allocation.