Re: [PATCH] sched/fair: Remove sd->nohz_idle

From: Vincent Guittot

Date: Tue Mar 03 2026 - 10:05:27 EST


On Fri, 27 Feb 2026 at 18:33, K Prateek Nayak <kprateek.nayak@xxxxxxx> wrote:
>
> Hello Vincent,
>
> On 2/27/2026 10:17 PM, Vincent Guittot wrote:
> > sd->nohz_idle is used to make sure &sd->shared->nr_busy_cpus is
> > incremented or decremented only once when entering or leaving the idle
> > state, but the calls to set_cpu_sd_state_idle|busy are already guarded
> > by rq->nohz_tick_stopped being set or cleared.
> >
> > Remove the useless sd->nohz_idle field which equals !rq->nohz_tick_stopped.
>
> I had looked at this recently and I believe the following is
> possible with hotplug/cpuset:
>
>     CPU0                                    CPU1
>     ====                                    ====
>
> nohz_balance_enter_idle()
>   atomic_dec(&sd->shared->nr_busy_cpus);
> ... /* Idle */
>                                         /* Sched domains are rebuilt */
>                                         atomic_set(&sd->shared->nr_busy_cpus, sd_weight);
>                                         update_top_cache_domain(0 /* CPU0 */)
>                                           sd = highest_flag_domain(cpu, SD_SHARE_LLC);
>                                           rcu_assign_pointer(per_cpu(sd_llc, cpu), sd); /* For CPU0 */
> ... /* Exits idle */
> ...
> ... /* First tick hits */
> /* Exits idle; First tick hits */
> if (rq->nohz_tick_stopped) /* True */
>   nohz_balance_exit_idle()
>     set_cpu_sd_state_busy()
>       atomic_inc(&sd->shared->nr_busy_cpus) /* !!! Crosses sd_weight !!! */
>
>
> I feel this per-SD indicator is necessary to avoid this scenario.
> Is it fixed in some other way that I haven't realised yet?

You're right!
And I don't have an easy way to fix this.
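
For reference, the interleaving above can be replayed in a small,
single-threaded user-space model. This is only a sketch and not kernel
code: struct sd_shared, struct sd and struct rq below are simplified
stand-ins, and enter_idle(), exit_idle(), replay() and the use_sd_flag
switch are hypothetical names made up for the illustration. It also
assumes that a rebuilt domain starts with nohz_idle == 0 and with
nr_busy_cpus reset to the domain weight.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the kernel structures (assumption). */
struct sd_shared { int nr_busy_cpus; };
struct sd { struct sd_shared *shared; bool nohz_idle; int weight; };
struct rq { bool nohz_tick_stopped; struct sd *sd_llc; };

/* Models nohz_balance_enter_idle() + set_cpu_sd_state_idle(). */
static void enter_idle(struct rq *rq, bool use_sd_flag)
{
	if (rq->nohz_tick_stopped)
		return;
	rq->nohz_tick_stopped = true;

	if (use_sd_flag) {
		if (rq->sd_llc->nohz_idle)
			return;
		rq->sd_llc->nohz_idle = true;
	}
	rq->sd_llc->shared->nr_busy_cpus--;
}

/* Models nohz_balance_exit_idle() + set_cpu_sd_state_busy(). */
static void exit_idle(struct rq *rq, bool use_sd_flag)
{
	if (!rq->nohz_tick_stopped)
		return;
	rq->nohz_tick_stopped = false;

	if (use_sd_flag) {
		/* The rebuilt domain never saw this CPU go idle. */
		if (!rq->sd_llc->nohz_idle)
			return;
		rq->sd_llc->nohz_idle = false;
	}
	rq->sd_llc->shared->nr_busy_cpus++;
}

/* Replays the scenario; returns nr_busy_cpus after CPU0 exits idle. */
static int replay(bool use_sd_flag)
{
	struct sd_shared shared = { .nr_busy_cpus = 2 };
	struct sd old_sd = { .shared = &shared, .nohz_idle = false, .weight = 2 };
	struct rq rq0 = { .nohz_tick_stopped = false, .sd_llc = &old_sd };

	/* CPU0 goes idle: one busy CPU fewer. */
	enter_idle(&rq0, use_sd_flag);
	assert(shared.nr_busy_cpus == 1);

	/* "CPU1" rebuilds the domains while CPU0 is still idle. */
	struct sd new_sd = { .shared = &shared, .nohz_idle = false, .weight = 2 };
	shared.nr_busy_cpus = new_sd.weight;
	rq0.sd_llc = &new_sd;

	/* CPU0 exits idle; rq->nohz_tick_stopped is still set. */
	exit_idle(&rq0, use_sd_flag);
	return shared.nr_busy_cpus;
}
```

In this model replay(false) returns 3, which is above the weight of 2,
while replay(true) returns 2 because the fresh domain's nohz_idle is
clear and the increment is skipped.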


>
> --
> Thanks and Regards,
> Prateek
>