[PATCH 2/2] sched/fair: distribute nohz ILB work across idle CPUs

From: Imran Khan

Date: Tue Apr 21 2026 - 01:07:49 EST


find_new_ilb() uses for_each_cpu_and() to iterate over nohz.idle_cpus_mask
from the lowest bit upward and returns the first idle housekeeping CPU
it finds. As a result, the lowest-numbered nohz idle CPU is (unfairly)
selected most of the time.

Fix this by selecting the nohz ILB CPU in a round-robin fashion, starting
the search just after the last selected CPU. This distributes the nohz
ILB work (which can be significant on large-scale systems) across all
eligible idle CPUs.

Signed-off-by: Imran Khan <imran.f.khan@xxxxxxxxxx>
---
kernel/sched/fair.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
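
For illustration only, here is a minimal userspace sketch of the round-robin
selection described in the changelog. It is not the kernel code: NR_CPUS, the
32-bit masks, the file-scope ilb_cpu_last and pick_ilb_round_robin() are
hypothetical stand-ins for nohz.idle_cpus_mask, the housekeeping mask,
nohz.ilb_cpu_last and find_new_ilb(), and the idle_cpu() re-check is folded
into the idle mask.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 8	/* hypothetical CPU count for this sketch */

/* Hypothetical stand-in for nohz.ilb_cpu_last; -1 means "none yet". */
static int ilb_cpu_last = -1;

/*
 * Scan the CPUs starting just after the last selected one, wrapping
 * around once, and return the first CPU that is set in both idle_mask
 * and hk_mask and is not this_cpu. Returns -1 if no CPU qualifies.
 */
static int pick_ilb_round_robin(uint32_t idle_mask, uint32_t hk_mask,
				int this_cpu)
{
	int start, i;

	assert(ilb_cpu_last >= -1 && ilb_cpu_last < NR_CPUS);
	start = (ilb_cpu_last + 1) % NR_CPUS;

	for (i = 0; i < NR_CPUS; i++) {
		int cpu = (start + i) % NR_CPUS;

		if (!(idle_mask & (1u << cpu)))
			continue;	/* not nohz idle */
		if (!(hk_mask & (1u << cpu)))
			continue;	/* not a housekeeping CPU */
		if (cpu == this_cpu)
			continue;	/* never pick the caller */

		ilb_cpu_last = cpu;	/* remember for the next scan */
		return cpu;
	}

	return -1;
}
```

With idle CPUs 1, 2, 4 and 5 (idle_mask 0x36), all CPUs housekeeping and the
caller on CPU 0, successive calls return 1, 2, 4, 5 and then 1 again, whereas
a scan that always starts from bit 0 would return 1 every time.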

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index bd35275a05b38..93bdb542ff714 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7213,6 +7213,7 @@ static struct {
 	cpumask_var_t idle_cpus_mask;
 	int has_blocked_load; /* Idle CPUS has blocked load */
 	int needs_update; /* Newly idle CPUs need their next_balance collated */
+	int ilb_cpu_last; /* Last CPU selected for nohz ILB */
 	unsigned long next_balance; /* in jiffy units */
 	unsigned long next_blocked; /* Next update of blocked load in jiffies */
 } nohz ____cacheline_aligned;
@@ -12420,13 +12421,17 @@ static inline int find_new_ilb(void)

 	hk_mask = housekeeping_cpumask(HK_TYPE_KERNEL_NOISE);

-	for_each_cpu_and(ilb_cpu, nohz.idle_cpus_mask, hk_mask) {
+	for_each_cpu_wrap(ilb_cpu, nohz.idle_cpus_mask, nohz.ilb_cpu_last + 1) {
+		if (!cpumask_test_cpu(ilb_cpu, hk_mask))
+			continue;

 		if (ilb_cpu == smp_processor_id())
 			continue;

-		if (idle_cpu(ilb_cpu))
+		if (idle_cpu(ilb_cpu)) {
+			nohz.ilb_cpu_last = ilb_cpu;
 			return ilb_cpu;
+		}
 	}

 	return -1;
--
2.34.1