Re: [PATCH] sched/fair: Fix nohz.next_balance update

From: Valentin Schneider
Date: Wed May 06 2020 - 06:28:40 EST



On 04/05/20 16:17, Vincent Guittot wrote:
>> Since we can gather all the updated rq->next_balance, including this_cpu,
>> in _nohz_idle_balance(), it's safe to remove the extra lines in
>> rebalance_domains() which are originally intended for this_cpu. And
>> finally the updating only happen in _nohz_idle_balance().
>
> I'm not sure that's always true. Nothing prevents nohz_idle_balance()
> to return false . Then run_rebalance_domains() calls
> rebalance_domains(this_rq ,SCHED_IDLE) outside _nohz_idle_balance().
> In this case we must keep the code in rebalance_domains().
>
> For example when the tick is not stopped when entering idle. Or when
> need_resched() returns true.
>

Going back to this; nohz_idle_balance() will return true regardless of the
return value of _nohz_idle_balance(), so AFAICT we won't fall through to
the rebalance_domains() in run_rebalance_domains() in case we had
need_resched() in _nohz_idle_balance().

This was changed in b7031a02ec75 ("sched/fair: Add NOHZ_STATS_KICK");
before then we would always have the local rebalance_domains(). Now, since
the bail out is caused by need_resched(), I think it's not such a crazy
thing *not* to do the local rebalance_domains(), but I wasn't super clear
on all of this.