RE: [PATCH] SCHED: scatter nohz idle balance target cpus

From: Jianyong Wu
Date: Tue Mar 18 2025 - 07:36:38 EST


Hi Peter,

Thanks for the reply.

> -----Original Message-----
> From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Sent: Tuesday, March 18, 2025 2:39 PM
> To: Jianyong Wu <wujianyong@xxxxxxxx>
> Cc: mingo@xxxxxxxxxx; vincent.guittot@xxxxxxxxxx; jianyong.wu@xxxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH] SCHED: scatter nohz idle balance target cpus
>
> On Tue, Mar 18, 2025 at 02:23:58AM +0000, Jianyong Wu wrote:
>
> Re subject; if you look at other patches for sched, you'll note that we do not
> capitalize it.
[Jianyong Wu]
OK, will correct it.

> Also, what you're doing is not scatter, it is rotation.
[Jianyong Wu]
It seems so. However, nohz idle balance occurs quite frequently, in proportion to the value of "HZ" and the number of CPUs, so the selected CPU will soon spread across the whole system. Compared with the current situation, where cpu0 is selected most of the time, the change in this patch looks more like a "scattering" effect. But it doesn't matter; "rotation" is fine. I can change it as you wish.
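
To illustrate the difference, here is a minimal userspace sketch in plain C. It is not the patch itself: pick_first_idle(), pick_rotating_idle(), NR_CPUS_SKETCH and the 64-bit mask are hypothetical stand-ins for the kernel's cpumask handling, assumed only for this example.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 64	/* hypothetical system size, one bit per CPU */

/* Scan from cpu0 every time: the lowest idle CPU always wins. */
static int pick_first_idle(uint64_t idle_mask)
{
	for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		if (idle_mask & (UINT64_C(1) << cpu))
			return cpu;
	return -1;	/* no idle CPU in the mask */
}

/*
 * Rotation: start scanning at the CPU after the one picked last time
 * and wrap around, so successive calls walk through all idle CPUs.
 * *last holds the previous pick (-1 before the first call).
 */
static int pick_rotating_idle(uint64_t idle_mask, int *last)
{
	assert(*last >= -1 && *last < NR_CPUS_SKETCH);

	for (int i = 1; i <= NR_CPUS_SKETCH; i++) {
		int cpu = (*last + i) % NR_CPUS_SKETCH;

		if (idle_mask & (UINT64_C(1) << cpu)) {
			*last = cpu;
			return cpu;
		}
	}
	return -1;	/* no idle CPU in the mask */
}
```

With CPUs 0, 1 and 3 idle (mask 0xB), pick_first_idle() returns 0 on every call, while pick_rotating_idle() returns 0, 1, 3 and then wraps back to 0.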

>
> > Currently, the cpu selection logic for nohz idle balance keeps no
> > history, so cpu0 is always chosen if it is in the nohz cpu mask.
> > This is unfair to the tasks residing in numa node0. It is worse on
> > machines with a large number of cpus, where nohz idle balance may
> > be very heavy.
>
> Since you seem to care about ilb and numa; there is this _very_ old patch set
> that never got finished:
>
> https://lore.kernel.org/all/20091211013056.305998000@xxxxxxxxx/
>
[Jianyong Wu]
Thanks for this. I'll look into it (it may take some time).

> IIRC there was a problem where it would simply stop running the per-node ilb
> when the node went idle, leading to node level imbalances. This should be
> curable by picking one such idle node and keeping its ILB active or somesuch.
[Jianyong Wu]
I think this patch is simple enough to achieve the "fairness". What do you think?
>
> Something to poke at if you're interested..
>