Re: [syzbot] [mm?] INFO: rcu detected stall in purge_vmap_node

In reply to: Hillf Danton: "Re: [syzbot] [mm?] INFO: rcu detected stall in purge_vmap_node"

From: Deepanshu Kartikey

Date: Mon Jan 12 2026 - 09:50:57 EST

On Mon, Jan 13, 2026 at 07:08:XX, Hillf Danton wrote:
> Given the test result of your patch, can you specify the root cause
> of the stall reported, Deepanshu?

Hi Hillf,

Thank you for the question. Looking at the stall in the test log, this is
occurring in a different code path from what my patch addresses:

My patch fixes:
kasan_release_vmalloc_node+0x1ba/0xad0 mm/vmalloc.c:2299
purge_vmap_node+0x1ba/0xad0

New stall location:
__reset_page_owner+0x84/0x1a0
__free_frozen_pages+0x7df/0x1170
vfree+0x1fd/0xb50
cleanup_vm_area_work+0x4c/0x100

The root cause pattern is similar (CONFIG_PAGE_OWNER stack unwinding under
RCU read lock), but it manifests in the page freeing path rather than the
KASAN shadow cleanup path.

My patch specifically addresses the unbounded loop in
kasan_release_vmalloc_node(), where we iterate through large purge_lists.
The new stall appears to be in __reset_page_owner() during page freeing,
which would need a separate fix in that code path.
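
To illustrate the general idea of bounding a long list walk, here is a
minimal userspace sketch. It is not the patch itself, and every name in it
(struct node, BATCH, release_list_batched, noop_yield) is hypothetical. In
kernel code the yield point would typically be something like
cond_resched(), assuming the calling context permits rescheduling there.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a purge-list entry (not struct vmap_area). */
struct node {
	struct node *next;
	int released;
};

/* Hypothetical batch size: how many entries to process between yields. */
#define BATCH 4

/* Example yield hook that does nothing; a real one would give up the CPU. */
static void noop_yield(void)
{
}

/*
 * Walk the list, mark each entry as released, and call yield_fn after
 * every BATCH entries so that one long list cannot monopolize the CPU.
 * Returns how many times yield_fn was called.
 */
static size_t release_list_batched(struct node *head, void (*yield_fn)(void))
{
	size_t since_yield = 0;
	size_t yields = 0;

	assert(yield_fn != NULL);

	for (struct node *n = head; n != NULL; n = n->next) {
		n->released = 1;
		if (++since_yield == BATCH) {
			yield_fn();
			yields++;
			since_yield = 0;
		}
	}
	return yields;
}
```

The batch counter keeps the yield cost amortized over several entries
instead of paying it on every iteration.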

Should I focus on submitting v2 of my vmalloc fix, or would you prefer I
investigate the page_owner stall as well?

Best regards,
Deepanshu