[PATCH v3 1/3] mm/kmemleak: avoid soft lockup when scanning task stacks

From: Breno Leitao

Date: Mon Jun 15 2026 - 13:51:56 EST


kmemleak_scan() walks every thread and scans its kernel stack under a
single rcu_read_lock() with no reschedule point. On a host with very
many threads -- amplified by KASAN/lockdep in debug builds -- this loop
can hog a CPU long enough to trip the soft lockup watchdog:

  watchdog: BUG: soft lockup - CPU#35 stuck for 22s! [kmemleak:537]
  scan_block
  kmemleak_scan
  kmemleak_scan_thread
  kthread

A cond_resched() cannot simply be added to the loop: it runs inside an RCU
read-side critical section, where voluntarily rescheduling is not allowed.

Walk the tasks one PID at a time with find_ge_pid(), taking the RCU read
lock only long enough to look up each task and pin it with
get_task_struct(). The stack is then scanned with no lock held, so
cond_resched() can run between tasks, and the walk ends early once
scan_should_stop() returns true. This follows the
next_tgid()/task_seq_get_next() iteration pattern and keeps each RCU
critical section short.

Fixes: c4b28963fd79 ("mm/kmemleak: rely on rcu for task stack scanning")
Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Breno Leitao <leitao@xxxxxxxxxx>
---
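Illustration only, not part of the patch: a minimal userspace sketch of
the lookup, pin, unlock, scan loop described in the changelog. Every name
in it (struct entry, read_lock(), find_ge(), walk_entries(), and so on) is
hypothetical and merely stands in for the kernel primitives; a depth
counter models the RCU read lock so the sketch can check that the scan
step never runs with the lock held.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a task; none of these names exist in the kernel. */
struct entry {
	int id;    /* plays the role of the PID number */
	int refs;  /* plays the role of the task_struct reference count */
	int live;  /* 0 models a struct pid with no PIDTYPE_PID task attached */
};

/* Sorted by id, with gaps, like a sparse PID space. */
static struct entry table[] = {
	{ 1, 0, 1 }, { 4, 0, 1 }, { 7, 0, 0 }, { 9, 0, 1 }, { 12, 0, 1 },
};

static int lock_depth;        /* models rcu_read_lock() nesting */
static int scans_under_lock;  /* counts scans that ran with the lock held */

static void read_lock(void)   { lock_depth++; }
static void read_unlock(void) { assert(lock_depth > 0); lock_depth--; }

/* Return the first entry with id >= nr, or NULL. Caller holds the lock. */
static struct entry *find_ge(int nr)
{
	size_t i;

	assert(lock_depth > 0);
	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++)
		if (table[i].id >= nr)
			return &table[i];
	return NULL;
}

/* The long-running step; it is meant to run with no lock held. */
static void scan_entry(const struct entry *e)
{
	(void)e;
	if (lock_depth > 0)
		scans_under_lock++;
}

/*
 * Walk all entries with one short locked lookup per iteration. A
 * non-negative stop_after ends the walk once that many entries have been
 * scanned, loosely modelling scan_should_stop(). Returns the number of
 * entries scanned.
 */
int walk_entries(int stop_after)
{
	struct entry *e;
	int nr = 1, scanned = 0;

	do {
		struct entry *pinned = NULL;

		read_lock();
		e = find_ge(nr);
		if (e) {
			nr = e->id + 1;		/* resume point for the next lookup */
			if (e->live) {
				e->refs++;	/* pin before dropping the lock */
				pinned = e;
			}
		}
		read_unlock();

		if (pinned) {
			scan_entry(pinned);	/* no lock held here */
			scanned++;
			pinned->refs--;		/* unpin */
		}
		/* A kernel loop would call cond_resched() at this point. */
	} while (e && !(stop_after >= 0 && scanned >= stop_after));

	return scanned;
}

/* Sum of outstanding pins; zero means every pin was released. */
int outstanding_refs(void)
{
	size_t i;
	int sum = 0;

	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++)
		sum += table[i].refs;
	return sum;
}

/* Number of scans that ran while the modelled lock was held. */
int scans_with_lock_held(void)
{
	return scans_under_lock;
}
```

With the table above, walk_entries(-1) visits ids 1, 4, 9 and 12 and skips
id 7, which has no live entry behind it. Because the walk resumes from a
number rather than from a pointer, nothing needs to stay valid across the
unlocked scan except the one pinned entry.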
mm/kmemleak.c | 51 ++++++++++++++++++++++++++++++++++++++-------------
1 file changed, 38 insertions(+), 13 deletions(-)

diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index 7c7ba17ce7af0..a7786b6bc174e 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -1695,6 +1695,42 @@ static void kmemleak_cond_resched(struct kmemleak_object *object)
 	put_object(object);
 }

+/*
+ * Scan all task kernel stacks, rescheduling between tasks. Each task is looked
+ * up and pinned within its own RCU read-side section, so no lock is held across
+ * the scan and the walk cannot trip the soft lockup watchdog.
+ */
+static void kmemleak_scan_task_stacks(void)
+{
+	struct pid *pid;
+	int nr = 1;
+
+	do {
+		struct task_struct *p = NULL;
+
+		rcu_read_lock();
+		pid = find_ge_pid(nr, &init_pid_ns);
+		if (pid) {
+			nr = pid_nr(pid) + 1;
+			p = pid_task(pid, PIDTYPE_PID);
+			if (p)
+				get_task_struct(p);
+		}
+		rcu_read_unlock();
+
+		if (p) {
+			void *stack = try_get_task_stack(p);
+
+			if (stack) {
+				scan_block(stack, stack + THREAD_SIZE, NULL);
+				put_task_stack(p);
+			}
+			put_task_struct(p);
+		}
+		cond_resched();
+	} while (pid && !scan_should_stop());
+}
+
 /*
  * Print one leak inline. The hex dump is gated on OBJECT_ALLOCATED so it
  * does not touch user memory that was freed concurrently; the rest of the
@@ -1884,19 +1920,8 @@ static void kmemleak_scan(void)
 	/*
 	 * Scanning the task stacks (may introduce false negatives).
 	 */
-	if (kmemleak_stack_scan) {
-		struct task_struct *p, *g;
-
-		rcu_read_lock();
-		for_each_process_thread(g, p) {
-			void *stack = try_get_task_stack(p);
-			if (stack) {
-				scan_block(stack, stack + THREAD_SIZE, NULL);
-				put_task_stack(p);
-			}
-		}
-		rcu_read_unlock();
-	}
+	if (kmemleak_stack_scan)
+		kmemleak_scan_task_stacks();

 	/*
 	 * Scan the objects already referenced from the sections scanned

--
2.53.0-Meta