Re: [PATCH v2] mm/kmemleak: avoid soft lockup when scanning task stacks
From: Lance Yang
Date: Sat Jun 13 2026 - 07:43:05 EST
On Sat, Jun 13, 2026 at 12:45:20PM +0200, Oleg Nesterov wrote:
>To avoid the confusion, I see nothing wrong in this patch, but see
>the question at the end.
>
>On 06/12, Breno Leitao wrote:
>>
>> +/*
>> + * Briefly drop the RCU read lock to reschedule during the task stack scan.
>> + * Both cursors are pinned across the gap; return false if either one was
>> + * unhashed meanwhile, so the caller stops this round instead of walking a
>> + * stale list.
>> + */
>> +static bool kmemleak_stack_scan_break(struct task_struct *g,
>> + struct task_struct *p)
>> +{
>> + bool can_cont;
>> +
>> + get_task_struct(g);
>> + get_task_struct(p);
>> +
>> + rcu_read_unlock();
>> + cond_resched();
>> + rcu_read_lock();
>> +
>> + can_cont = pid_alive(g) && pid_alive(p);
>> +
>> + put_task_struct(p);
>> + put_task_struct(g);
>> +
>> + return can_cont;
>> +}
>
>Perhaps we can rename and export rcu_lock_break() to avoid the duplication...
>
>And, this is slightly off-topic, please ignore, but this reminds me about
>[PATCH 1/2] introduce for_each_process_thread_break() and for_each_process_thread_continue()
>https://lore.kernel.org/all/20180912163335.GA18748@xxxxxxxxxx/
>
>> @@ -1890,11 +1917,21 @@ static void kmemleak_scan(void)
>> rcu_read_lock();
>> for_each_process_thread(g, p) {
>> void *stack = try_get_task_stack(p);
>> +
>> if (stack) {
>> scan_block(stack, stack + THREAD_SIZE, NULL);
>> put_task_stack(p);
>> }
>> + /*
>> + * This is an expensive loop, we must to call the
>> + * scheduler to avoid lockups
>> + */
>> + if (need_resched() && !kmemleak_stack_scan_break(g, p)) {
>> + aborted = true;
>> + goto unlock;
>
>Can this need_resched() check actually help if CONFIG_PREEMPTION &&
>CONFIG_PREEMPT_RCU ?
Well spotted.
>In this case (lets ignore PREEMPT_DYNAMIC to simplify) rcu_read_lock()
>doesn't disable preemption and cond_resched() is nop, need_resched() is
>(almost) never true. Right?
>
>I guess even in this case it makes sense to not abuse rcu_read_lock()
>"too much", but perhaps we need something more clever than need_resched() ?
>
>Note that check_hung_uninterruptible_tasks() uses time_after()...
Ouch, right, I missed that ...
It would be better to trigger the break from time_after(), not need_resched().
need_resched() may not buy much on PREEMPT_RCU ...
So yeah, a time-based check should address your concern, right?
Cheers, Lance