Re: [PATCH v4 1/3] sched/fair: Move checking for nohz cpus after time check

From: Shrikanth Hegde

Date: Tue Jan 13 2026 - 06:18:32 EST




On 1/13/26 3:51 PM, Vincent Guittot wrote:
On Tue, 13 Jan 2026 at 10:23, Shrikanth Hegde <sshegde@xxxxxxxxxxxxx> wrote:



On 1/13/26 2:37 PM, Vincent Guittot wrote:
On Mon, 12 Jan 2026 at 06:05, Shrikanth Hegde <sshegde@xxxxxxxxxxxxx> wrote:

NOHZ idle load balancer is kicked off only after time check. So move
the atomic read after the time check to access it only when needed.

When there are no idle CPUs(100% busy), even if the flag gets set to
NOHZ_STATS_KICK | NOHZ_NEXT_KICK, find_new_ilb will fail and
there will be no NOHZ idle balance. The current behaviour is retained.

Note: This patch doesn't solve any cacheline overheads. No improvement
in performance apart from saving a few cycles of atomic_read.

But won't these cycles be then wasted by calling needlessly kick_ilb


when there are nohz cpus, i.e nohz.nr_cpus > 0, there is no change in codeflow.

Only when system is 100%(which is expected to be rare), nohz.nr_cpus == 0,
then it is expected that has_blocked_load = 0. So flags shouldn't be set.

The way we are setting/clearing has_blocked_load vs
nr_cpus/idle_cpus_mask implies that it's possible to get
has_blocked_load == 1 but nr_cpus == 0 although it's a corner case and
not a default behavior

No CPUs are idle: nr_cpus == 0

CPU 0 enters idle
- inc nr_cpus and set idle_cpus_mask
- set nohz.has_blocked

CPU0 wakes up

Tick fires on CPU0
- dec nr_cpus and clear idle_cpus_mask
- nohz.has_blocked == 1, most probably now > nohz.next_blocked, if
now < nohz.next_balance, we skip the test of nr_cpus and we call
kick_ilb() but nr_cpus == 0 and idle_cpus_mask is empty

Note we are still doing a return if nohz.nr_cpus == 0. So kick_ilb shouldn't be
called.

The return can be skipped by if (time_before(now, nohz.next_balance)) goto out


Assuming HZ=1000,

I see LOAD_AVG_PERIOD = 32, whereas next_balance is usually 60. so it is a really narrow window of 28 ticks and system being close to 100% busy.



Do you see any path still calling kick_ilb un-necessarily?

Yes but at the same time it's clearly not the main case



So i assume we can do this patch considering the common case?
If not, let me know, I can drop it.