Re: [PATCH] sched/fair: Remove sd->nohz_idle

From: K Prateek Nayak

Date: Fri Feb 27 2026 - 12:33:23 EST


Hello Vincent,

On 2/27/2026 10:17 PM, Vincent Guittot wrote:
> sd->nohz_idle is used to call once inc|dec of &sd->shared->nr_busy_cpus
> when entering or leaving idle state but the call to
> set_cpu_sd_state_idle|busy is already protected by rq->nohz_tick_stopped
> being already set or clear.
>
> Remove the useless sd->nohz_idle field which equals !rq->nohz_tick_stopped.

I had looked at this recently and I believe the following is
possible with hotplug/cpuset:

CPU0                                            CPU1
====                                            ====

nohz_balance_enter_idle()
  atomic_dec(&sd->shared->nr_busy_cpus);
... /* Idle */
                                                /* Sched domains are rebuilt */
                                                atomic_set(&sd->shared->nr_busy_cpus, sd_weight);
                                                update_top_cache_domain(0 /* CPU0 */)
                                                  sd = highest_flag_domain(cpu, SD_SHARE_LLC);
                                                  rcu_assign_pointer(per_cpu(sd_llc, cpu), sd); /* For CPU0 */
... /* Exits idle */
...
... /* First tick hits */
/* Exits idle; First tick hits */
if (rq->nohz_tick_stopped) /* True */
  nohz_balance_exit_idle()
    set_cpu_sd_state_busy()
      atomic_inc(&sd->shared->nr_busy_cpus) /* !!! Crosses sd_weight !!! */


I feel this per-SD indicator is necessary to avoid this scenario.
Is it fixed in some other way that I haven't realised yet?
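
To make the sequence above easier to poke at, here is a rough userspace
model of it. This is only a sketch and not kernel code: every model_*
name and replay_race() are made up for illustration, and it assumes that
a rebuilt domain starts with its per-SD flag cleared and its busy count
reset to the domain weight.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel structures; illustration only. */
struct model_sd_shared { atomic_int nr_busy_cpus; };
struct model_sd { struct model_sd_shared *shared; bool nohz_idle; };
struct model_rq { bool nohz_tick_stopped; struct model_sd *sd_llc; };

/* Idle entry: drop the busy count once per tick-stopped period. */
static void model_enter_idle(struct model_rq *rq, bool use_sd_flag)
{
	struct model_sd *sd = rq->sd_llc;

	if (rq->nohz_tick_stopped)
		return;
	rq->nohz_tick_stopped = true;

	if (use_sd_flag) {
		if (sd->nohz_idle)
			return;
		sd->nohz_idle = true;
	}
	atomic_fetch_sub(&sd->shared->nr_busy_cpus, 1);
}

/* Idle exit: raise the busy count again, optionally guarded per-SD. */
static void model_exit_idle(struct model_rq *rq, bool use_sd_flag)
{
	struct model_sd *sd = rq->sd_llc;

	if (!rq->nohz_tick_stopped)
		return;
	rq->nohz_tick_stopped = false;

	if (use_sd_flag) {
		if (!sd->nohz_idle)
			return;	/* this domain never saw the decrement */
		sd->nohz_idle = false;
	}
	atomic_fetch_add(&sd->shared->nr_busy_cpus, 1);
}

/* Assumed rebuild behaviour: count reset to weight, flag starts clear. */
static void model_rebuild(struct model_sd *sd, int sd_weight)
{
	atomic_store(&sd->shared->nr_busy_cpus, sd_weight);
	sd->nohz_idle = false;
}

/* Replays enter idle -> rebuild -> exit idle; returns the final count. */
int replay_race(bool use_sd_flag)
{
	const int sd_weight = 2;
	struct model_sd_shared shared;
	struct model_sd sd = { .shared = &shared, .nohz_idle = false };
	struct model_rq rq = { .nohz_tick_stopped = false, .sd_llc = &sd };

	atomic_init(&shared.nr_busy_cpus, sd_weight);

	model_enter_idle(&rq, use_sd_flag);	/* CPU0 goes idle */
	model_rebuild(&sd, sd_weight);		/* CPU1 rebuilds domains */
	model_exit_idle(&rq, use_sd_flag);	/* CPU0's first tick */

	return atomic_load(&shared.nr_busy_cpus);
}
```

In this model, replay_race(false) ends with a count of 3 for a domain of
weight 2, while replay_race(true) ends at 2, which is the behaviour I
would expect the per-SD indicator to preserve.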

--
Thanks and Regards,
Prateek