Re: [PATCH v4] sched/fair: Skip sched_balance_running cmpxchg when balance is not due
From: Srikar Dronamraju
Date: Wed Nov 12 2025 - 11:02:28 EST
* Peter Zijlstra <peterz@xxxxxxxxxxxxx> [2025-11-12 14:39:37]:
> On Wed, Nov 12, 2025 at 04:55:48PM +0530, Srikar Dronamraju wrote:
>
> > If the CPU that was doing the balance was not the first CPU of the domain
> > span, but it was doing the balance since the first CPU was busy, and the
> > first CPU now happens to be idle at redo, the scheduler would have chosen the
> > first CPU to do the balance. However it will now choose the CPU that had the atomic.
> >
> > I think this is better because
> > - The first CPU may have tried just before this CPU dropped the atomic and
> > hence we may miss the balance opportunity.
> > - The first CPU and the other CPU may not be sharing cache and hence there
> > may be a cache-miss, which we are avoiding by doing this.
>
> I'm not sure I understand what you're arguing for. Are you saying it
> would be better to retain the lock where possible?
>
Yes, I was arguing for keeping the lock and not checking should_we_balance()
with the lock held.

Let's say CPU2 enters sched_balance_rq(), should_we_balance() succeeds, and
CPU2 takes the lock. It goes to redo, and this time should_we_balance() may
not succeed for CPU2 (since CPU0/1 is now idle). However, CPU0 may have
already raced with CPU2, tried to take the lock before CPU2 released it, and
bailed out. So we miss a balancing opportunity.
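
To make the interleaving concrete, here is a rough userspace toy, not kernel
code. It replays one fixed ordering of events with a C11 atomic standing in
for sched_balance_running. All names in it (balance_running, try_lock(),
unlock(), toy_should_we_balance(), simulate_redo()) are made up for
illustration, and the toy check only models the "first idle CPU in the span
balances" part of should_we_balance().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 3

/* Hypothetical stand-in for sched_balance_running: 0 = free, 1 = held. */
static atomic_int balance_running;

static bool try_lock(void)
{
	int expected = 0;

	return atomic_compare_exchange_strong(&balance_running, &expected, 1);
}

static void unlock(void)
{
	atomic_store(&balance_running, 0);
}

/* Toy should_we_balance(): only the first idle CPU in the span may balance. */
static bool toy_should_we_balance(int cpu, const bool idle[NR_CPUS])
{
	for (int i = 0; i < NR_CPUS; i++)
		if (idle[i])
			return i == cpu;
	return false;
}

/*
 * Replay one fixed interleaving and return how many balance passes ran
 * after CPU0 went idle. recheck_at_redo selects the policy being discussed:
 * true  = CPU2 re-evaluates should_we_balance() at redo,
 * false = CPU2 keeps the lock and simply retries.
 */
static int simulate_redo(bool recheck_at_redo)
{
	bool idle[NR_CPUS] = { false, false, true };	/* CPU0/1 busy, CPU2 idle */
	bool cpu2_has_lock;
	int completed = 0;

	atomic_store(&balance_running, 0);

	/* CPU2: should_we_balance() succeeds and it takes the lock. */
	assert(toy_should_we_balance(2, idle));
	cpu2_has_lock = try_lock();
	assert(cpu2_has_lock);

	/* CPU0 becomes idle while CPU2 is still balancing. */
	idle[0] = true;

	/* CPU0: passes the check, but the cmpxchg fails, so it bails out. */
	if (toy_should_we_balance(0, idle) && try_lock()) {
		completed++;
		unlock();
	}

	/* CPU2 reaches redo, still holding the lock. */
	if (!recheck_at_redo || toy_should_we_balance(2, idle))
		completed++;	/* CPU2 retries the balance itself */
	unlock();

	return completed;
}
```

In this ordering, simulate_redo(true) returns 0 (CPU0 already bailed out and
CPU2 gives up at redo, so nobody balances), while simulate_redo(false)
returns 1 (CPU2 retains the lock and retries).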
>
--
Thanks and Regards
Srikar Dronamraju