Re: [PATCH v3 1/3] mm/kmemleak: avoid soft lockup when scanning task stacks

From: Davidlohr Bueso

Date: Mon Jun 15 2026 - 14:56:49 EST


On Mon, 15 Jun 2026, Breno Leitao wrote:

> kmemleak_scan() walks every thread and scans its kernel stack under a
> single rcu_read_lock() with no reschedule point. On a host with very
> many threads -- amplified by KASAN/lockdep in debug builds -- this loop
> can hog a CPU long enough to trip the soft lockup watchdog:
>
>   watchdog: BUG: soft lockup - CPU#35 stuck for 22s! [kmemleak:537]
>     scan_block
>     kmemleak_scan
>     kmemleak_scan_thread
>     kthread
>
> A cond_resched() cannot be added directly: the loop runs inside an RCU
> read-side critical section.
>
> Walk the tasks one PID at a time with find_ge_pid(), taking the RCU read
> lock only to look up and pin each task. The stack is then scanned with no
> lock held, so cond_resched() runs between tasks and the scan stops early
> on scan_should_stop(). This follows the next_tgid()/task_seq_get_next()
> iteration pattern and keeps each RCU critical section short.
>
> Fixes: c4b28963fd79 ("mm/kmemleak: rely on rcu for task stack scanning")
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Breno Leitao <leitao@xxxxxxxxxx>
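
For readers following along, a rough userspace model of the iteration
described above may help picture it. This is not the patch: a pthread
mutex stands in for rcu_read_lock(), a plain counter for the task
reference, sched_yield() for cond_resched(), and a flag for
scan_should_stop(). All model_* names, the tasks[] table and MAX_PID are
made up for illustration.

```c
/* Userspace model only: none of these helpers are kernel APIs. */
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PID 64                  /* hypothetical, tiny PID space */

struct model_task {
    bool live;                      /* slot currently holds a task */
    int refs;                       /* pin count (models a task reference) */
    int scanned;                    /* times the "stack" was scanned */
};

static struct model_task tasks[MAX_PID];
/* Stands in for rcu_read_lock()/rcu_read_unlock() in this model. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
/* Stands in for scan_should_stop(). */
static bool stop_requested;

/* Lowest live pid >= nr, or -1. Caller must hold table_lock. */
static int model_find_ge_pid(int nr)
{
    for (int pid = nr; pid < MAX_PID; pid++)
        if (tasks[pid].live)
            return pid;
    return -1;
}

/* Look up and pin the next task under the lock, then drop the lock. */
static struct model_task *model_next_task(int *pid)
{
    struct model_task *t = NULL;

    pthread_mutex_lock(&table_lock);
    int found = model_find_ge_pid(*pid);
    if (found >= 0) {
        t = &tasks[found];
        t->refs++;                  /* pin so it outlives the lock */
        *pid = found;
    }
    pthread_mutex_unlock(&table_lock);
    return t;
}

/* Drop the pin taken by model_next_task(). */
static void model_put_task(struct model_task *t)
{
    pthread_mutex_lock(&table_lock);
    assert(t->refs > 0);
    t->refs--;
    pthread_mutex_unlock(&table_lock);
}

/* Visit every live task once, in pid order; returns how many were scanned. */
int model_scan_all(void)
{
    int scanned = 0;
    int pid = 0;
    struct model_task *t;

    while ((t = model_next_task(&pid)) != NULL) {
        t->scanned++;               /* "scan the stack" with no lock held */
        scanned++;
        model_put_task(t);
        pid++;                      /* resume the search after this pid */
        if (stop_requested)         /* stop early when asked to */
            break;
        sched_yield();              /* reschedule point between tasks */
    }
    return scanned;
}
```

Marking slots 3, 7 and 40 live and calling model_scan_all() visits each
once and leaves every pin count at zero. The only point of the model is
that the lock covers the lookup and the pin, never the scan itself.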

LGTM

Reviewed-by: Davidlohr Bueso <dave@xxxxxxxxxxxx>