Re: [PATCH] sched/fair: Restore RCU read lock in set_cpu_sd_state_{busy,idle}()
From: Andrea Righi
Date: Fri May 22 2026 - 05:11:40 EST
Hi Peter,
On Fri, May 22, 2026 at 09:28:53AM +0200, Peter Zijlstra wrote:
> On Thu, May 21, 2026 at 10:51:15PM +0200, Andrea Righi wrote:
> > Commit c9d93a73ce87 ("sched/fair: Drop redundant RCU read lock in NOHZ
> > kick path") removed the rcu_read_lock()/unlock() pair from
> > set_cpu_sd_state_busy() and set_cpu_sd_state_idle() on the assumption
> > that all callers run in a safe context for rcu_dereference_all(): IRQs
> > disabled or cpus_write_lock() held.
> >
> > That assumption is wrong for the CPU hotplug teardown path. When CPUs
> > are taken offline, set_cpu_sd_state_busy() is invoked via:
> >
> >   cpuhp/N kthread
> >     cpuhp_thread_fun()
> >       cpuhp_invoke_callback()
> >         sched_cpu_deactivate()
> >           nohz_balance_exit_idle()
> >             set_cpu_sd_state_busy()
> >               rcu_dereference_all(per_cpu(sd_llc, cpu))
>
> >
> > Restore the rcu_read_lock()/unlock() pair in both helpers;
> > nohz_balancer_kick() is left as is, since its IRQ-disabled context is
> > genuinely sufficient.
> >
> > Fixes: c9d93a73ce87 ("sched/fair: Drop redundant RCU read lock in NOHZ kick path")
> > Reported-by: Marek Szyprowski <m.szyprowski@xxxxxxxxxxx>
> > Closes: https://lore.kernel.org/all/38fe0a1d-1a48-435a-910a-c278024d9ac9@xxxxxxxxxxx/
> > Signed-off-by: Andrea Righi <arighi@xxxxxxxxxx>
>
> So the obvious alternative is to take the RCU read lock in the one
> caller that doesn't play ball.
>
> Was that considered?

This also works (tested, just in case). Since the original intent was to drop
these redundant RCU read locks, we should probably go this way. I'll send a new
patch shortly with your Suggested-by.

Thanks!
-Andrea

>
>
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -8699,7 +8699,8 @@ int sched_cpu_deactivate(unsigned int cp
>  	 * Remove CPU from nohz.idle_cpus_mask to prevent participating in
>  	 * load balancing when not active
>  	 */
> -	nohz_balance_exit_idle(rq);
> +	scoped_guard (rcu)
> +		nohz_balance_exit_idle(rq);
>  
>  	set_cpu_active(cpu, false);
>
>
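
For anyone not familiar with the scoped_guard (rcu) form used in the hunk
above: it takes the RCU read lock for the duration of the statement that
follows and drops it when that statement's scope ends. Below is a minimal
userspace sketch of the idea, not kernel code. It assumes GCC or Clang (for
the cleanup attribute), and rcu_depth, scoped_guard_rcu(),
nohz_balance_exit_idle_stub() and demo() are hypothetical stand-ins made up
for illustration; the rcu_read_lock()/rcu_read_unlock() here are toy counters,
not the real primitives.

```c
#include <assert.h>

/* Toy stand-ins for the RCU read-side primitives: just a nesting counter. */
static int rcu_depth;
static void rcu_read_lock(void)   { rcu_depth++; }
static void rcu_read_unlock(void) { rcu_depth--; }

/* Cleanup handler: runs automatically when the guard variable leaves scope. */
static void rcu_guard_exit(int *guard)
{
	(void)guard;
	rcu_read_unlock();
}

/*
 * Hypothetical scoped guard: take the lock in the for-loop initialiser,
 * run the following statement exactly once, and let the cleanup attribute
 * release the lock when the loop scope ends.
 */
#define scoped_guard_rcu()						\
	for (int guard __attribute__((cleanup(rcu_guard_exit))) =	\
		     (rcu_read_lock(), 1), done = 0;			\
	     !done; done = 1)

/* Stub callee that records the read-side nesting depth it observes. */
static int depth_seen_in_callee = -1;
static void nohz_balance_exit_idle_stub(void)
{
	depth_seen_in_callee = rcu_depth;
}

/* Mirrors the shape of the hunk: only the one call runs under the guard. */
static int demo(void)
{
	scoped_guard_rcu()
		nohz_balance_exit_idle_stub();

	return rcu_depth;	/* depth after the guarded statement */
}
```

The point of this shape is that the helpers themselves stay lock-free, as the
original cleanup intended, and only the one caller that needs a read-side
critical section pays for it.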