Re: [PATCH] sched/rt: use for_each_cpu_wrap to iterate over rto_mask

From: Peter Zijlstra
Date: Fri Nov 15 2024 - 05:06:07 EST


On Thu, Nov 14, 2024 at 03:05:58PM -0700, Jon Kohler wrote:
> When using NO_RT_PUSH_IPI, using for_each_cpu() over rto_mask may cause
> many CPUs to attempt to pull load from the same CPU, causing RQ
> lock contention.
>
> Use for_each_cpu_wrap instead to spread out which RQ gets evaluated
> first, similar to how _nohz_idle_balance iterates over idle_cpus_mask.
> This strategy is beneficial when there are many CPUs in rto_mask and
> many other CPUs going in and out of schedule() at the same time.
>
> Signed-off-by: Jon Kohler <jon@xxxxxxxxxxx>
> ---
> kernel/sched/rt.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> index 172c588de542..c883ff122f5d 100644
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -2308,7 +2308,7 @@ static void pull_rt_task(struct rq *this_rq)
> 	}
> #endif
>
> -	for_each_cpu(cpu, this_rq->rd->rto_mask) {
> +	for_each_cpu_wrap(cpu, this_rq->rd->rto_mask, this_cpu+1) {
> 		if (this_cpu == cpu)
> 			continue;

Works for me I suppose, but as with that other rt patch, please do the
matching change for dl too.