Re: [PATCH] sched/fair: Restore RCU read lock in set_cpu_sd_state_{busy,idle}()
From: Peter Zijlstra
Date: Fri May 22 2026 - 03:33:59 EST
On Thu, May 21, 2026 at 10:51:15PM +0200, Andrea Righi wrote:
> Commit c9d93a73ce87 ("sched/fair: Drop redundant RCU read lock in NOHZ
> kick path") removed the rcu_read_lock()/unlock() pair from
> set_cpu_sd_state_busy() and set_cpu_sd_state_idle() on the assumption
> that all callers run in a safe context for rcu_dereference_all(): IRQs
> disabled or cpus_write_lock() held.
>
> That assumption is wrong for the CPU hotplug teardown path. When CPUs
> are taken offline, set_cpu_sd_state_busy() is invoked via:
>
>   cpuhp/N kthread
>     cpuhp_thread_fun()
>       cpuhp_invoke_callback()
>         sched_cpu_deactivate()
>           nohz_balance_exit_idle()
>             set_cpu_sd_state_busy()
>               rcu_dereference_all(per_cpu(sd_llc, cpu))
>
> Restore the rcu_read_lock()/unlock() pair in both helpers;
> nohz_balancer_kick() is left as is, since its IRQ-disabled context is
> genuinely sufficient.
>
> Fixes: c9d93a73ce87 ("sched/fair: Drop redundant RCU read lock in NOHZ kick path")
> Reported-by: Marek Szyprowski <m.szyprowski@xxxxxxxxxxx>
> Closes: https://lore.kernel.org/all/38fe0a1d-1a48-435a-910a-c278024d9ac9@xxxxxxxxxxx/
> Signed-off-by: Andrea Righi <arighi@xxxxxxxxxx>
So the obvious alternative is to take the RCU read lock in the one
caller that doesn't play ball, rather than in both helpers.
Was that considered?
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8699,7 +8699,8 @@ int sched_cpu_deactivate(unsigned int cp
 	 * Remove CPU from nohz.idle_cpus_mask to prevent participating in
 	 * load balancing when not active
 	 */
-	nohz_balance_exit_idle(rq);
+	scoped_guard (rcu)
+		nohz_balance_exit_idle(rq);
 	set_cpu_active(cpu, false);
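
For readers unfamiliar with the guard used in the hunk above, here is a
minimal userspace sketch of the scoping behaviour it relies on. This is
not the kernel implementation: rcu_read_lock()/rcu_read_unlock() are
replaced by a nesting counter, and scoped_rcu_guard(), struct rcu_guard,
exit_idle_sketch() and deactivate_sketch() are hypothetical names made up
for illustration. It assumes GCC or Clang, since it depends on the
cleanup variable attribute.

```c
#include <assert.h>

/* Userspace model: count read-side nesting instead of doing real RCU. */
static int rcu_depth;
static int depth_seen_by_callee = -1;

static void rcu_read_lock(void)   { rcu_depth++; }
static void rcu_read_unlock(void) { rcu_depth--; }

/* Hypothetical guard object; 'done' makes the for loop below run once. */
struct rcu_guard { int done; };

static struct rcu_guard rcu_guard_enter(void)
{
	rcu_read_lock();
	return (struct rcu_guard){ .done = 0 };
}

/* Runs automatically when the guard variable goes out of scope. */
static void rcu_guard_exit(struct rcu_guard *g)
{
	(void)g;
	rcu_read_unlock();
}

/*
 * Simplified stand-in for scoped_guard (rcu): a single-iteration for
 * loop whose loop variable carries a cleanup handler, so the unlock
 * happens on every exit from the guarded statement.
 */
#define scoped_rcu_guard()						\
	for (struct rcu_guard _g __attribute__((cleanup(rcu_guard_exit))) = \
		     rcu_guard_enter();					\
	     !_g.done; _g.done = 1)

/* Stand-in for nohz_balance_exit_idle(): records the depth it ran at. */
static void exit_idle_sketch(void)
{
	depth_seen_by_callee = rcu_depth;
}

/* Stand-in for the patched caller; returns the depth after the guard. */
static int deactivate_sketch(void)
{
	scoped_rcu_guard()
		exit_idle_sketch();

	/* Anything here runs outside the read-side section again. */
	return rcu_depth;
}
```

In this model the callee observes a depth of 1 and the caller is back at
depth 0 once the guarded statement finishes, which is the property the
hunk depends on: only nohz_balance_exit_idle() is covered, and
set_cpu_active() still runs outside the read-side section.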