Re: [PATCH RFC] mm/kmemleak: avoid soft lockup when scanning task stacks
From: Lance Yang
Date: Thu Jun 11 2026 - 23:16:26 EST
On Thu, Jun 11, 2026 at 05:45:00AM -0700, Breno Leitao wrote:
>kmemleak_scan() walks every thread and scans its kernel stack under a
>single rcu_read_lock() with no reschedule point. On a host with very
>many threads -- amplified by KASAN/lockdep in debug builds -- this loop
>can hog a CPU long enough to trip the soft lockup watchdog:
>
> watchdog: BUG: soft lockup - CPU#35 stuck for 22s! [kmemleak:537]
> scan_block
> kmemleak_scan
> kmemleak_scan_thread
> kthread
Neat, good catch!
>A cond_resched() cannot be added directly: the loop runs inside an RCU
>read-side critical section.
>
>Split the scan in two parts:
>
>1) get the list of tasks (with RCU read lock) in an array
>2) run scan_block() for the tasks (with cond_resched()).
>
>Is it a sane approach?
Why not use the kernel/hung_task.c pattern here? Seems simpler, with no
extra task-array allocation ;)
>Signed-off-by: Breno Leitao <leitao@xxxxxxxxxx>
>---
Could break RCU only when resched is needed. Pin the current cursors,
drop RCU, cond_resched(), take RCU again, and continue only if the
cursors are still alive ;)
If either cursor died while RCU was dropped, stopping this scan round
should be fine, IMHO.
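To make the idea concrete, here is a userspace toy model of that break
pattern. It is only a sketch, not kernel code: struct task, read_lock(),
read_unlock(), scan_break(), scan_tasks() and exit_task_1() are all made-up
names, a single cursor stands in for the g/p pair, and a hook stands in for
whatever happens while the scanner is scheduled out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task; not the kernel's task_struct. */
struct task {
	int refs;	/* models get_task_struct()/put_task_struct() */
	bool alive;	/* models what pid_alive() would report */
};

/* Models the RCU read-side nesting depth (assumption: no real RCU here). */
static int read_lock_depth;

static void read_lock(void)
{
	read_lock_depth++;
}

static void read_unlock(void)
{
	assert(read_lock_depth > 0);
	read_lock_depth--;
}

/* Runs while the lock is dropped, i.e. the window where tasks may exit. */
typedef void (*resched_hook)(struct task *tasks, size_t n);

/* Example hook: task 1 exits while the scanner is scheduled out. */
static void exit_task_1(struct task *tasks, size_t n)
{
	if (n > 1)
		tasks[1].alive = false;
}

/* Pin the cursor, drop the lock, "reschedule", retake, then revalidate. */
static bool scan_break(struct task *cursor, struct task *tasks, size_t n,
		       resched_hook hook)
{
	bool can_cont;

	cursor->refs++;			/* pin: the memory stays valid */
	read_unlock();
	if (hook)
		hook(tasks, n);		/* stands in for cond_resched() */
	read_lock();
	can_cont = cursor->alive;	/* is the cursor still on the list? */
	cursor->refs--;

	return can_cont;
}

/* Returns how many tasks were scanned before the round ended. */
static size_t scan_tasks(struct task *tasks, size_t n, size_t break_every,
			 resched_hook hook)
{
	size_t scanned = 0;

	read_lock();
	for (size_t i = 0; i < n; i++) {
		scanned++;		/* stands in for scan_block() */
		if (break_every && (i + 1) % break_every == 0 &&
		    !scan_break(&tasks[i], tasks, n, hook))
			break;		/* cursor died: stop this round */
	}
	read_unlock();

	return scanned;
}
```

With four live tasks and a break after every second one, scan_tasks()
visits all four. If the cursor exits during the first break, the round
stops after two tasks and the lock depth is back to zero on return.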
---8<---
diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index 7c7ba17ce7af..1062d9545054 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -1695,6 +1695,26 @@ static void kmemleak_cond_resched(struct kmemleak_object *object)
 	put_object(object);
 }
+static bool kmemleak_stack_scan_break(struct task_struct *g,
+				      struct task_struct *p)
+{
+	bool can_cont;
+
+	get_task_struct(g);
+	get_task_struct(p);
+
+	rcu_read_unlock();
+	cond_resched();
+	rcu_read_lock();
+
+	can_cont = pid_alive(g) && pid_alive(p);
+
+	put_task_struct(p);
+	put_task_struct(g);
+
+	return can_cont;
+}
+
 /*
  * Print one leak inline. The hex dump is gated on OBJECT_ALLOCATED so it
  * does not touch user memory that was freed concurrently; the rest of the
@@ -1894,7 +1914,10 @@ static void kmemleak_scan(void)
 				scan_block(stack, stack + THREAD_SIZE, NULL);
 				put_task_stack(p);
 			}
+			if (need_resched() && !kmemleak_stack_scan_break(g, p))
+				goto unlock;
 		}
+unlock:
 		rcu_read_unlock();
 	}
---
Not tested, though, feel free to grab it if it looks sane :)
[...]
Cheers, Lance