Re: [PATCH v3 1/2] sched/fair: Add NOHZ balancer flag for nohz.next_balance updates
From: Dietmar Eggemann
Date: Mon Aug 23 2021 - 09:53:21 EST
On 23/08/2021 14:57, Valentin Schneider wrote:
> On 23/08/21 13:59, Peter Zijlstra wrote:
>> On Mon, Aug 23, 2021 at 12:16:59PM +0100, Valentin Schneider wrote:
>>
>>> Gate NOHZ blocked load
>>> update by the presence of NOHZ_STATS_KICK - currently all NOHZ balance
>>> kicks will have the NOHZ_STATS_KICK flag set, so no change in behaviour is
>>> expected.
>>
>>> @@ -10572,7 +10572,8 @@ static void _nohz_idle_balance(struct rq *this_rq, unsigned int flags,
>>> * setting the flag, we are sure to not clear the state and not
>>> * check the load of an idle cpu.
>>> */
>>> - WRITE_ONCE(nohz.has_blocked, 0);
>>> + if (flags & NOHZ_STATS_KICK)
>>> + WRITE_ONCE(nohz.has_blocked, 0);
>>>
>>> /*
>>> * Ensures that if we miss the CPU, we must see the has_blocked
>>> @@ -10594,13 +10595,15 @@ static void _nohz_idle_balance(struct rq *this_rq, unsigned int flags,
>>> * balancing owner will pick it up.
>>> */
>>> if (need_resched()) {
>>> - has_blocked_load = true;
>>> + if (flags & NOHZ_STATS_KICK)
>>> + has_blocked_load = true;
>>> goto abort;
>>> }
>>>
>>> rq = cpu_rq(balance_cpu);
>>>
>>> - has_blocked_load |= update_nohz_stats(rq);
>>> + if (flags & NOHZ_STATS_KICK)
>>> + has_blocked_load |= update_nohz_stats(rq);
>>>
>>> /*
>>> * If time for next balance is due,
>>> @@ -10631,8 +10634,9 @@ static void _nohz_idle_balance(struct rq *this_rq, unsigned int flags,
>>> if (likely(update_next_balance))
>>> nohz.next_balance = next_balance;
>>>
>>> - WRITE_ONCE(nohz.next_blocked,
>>> - now + msecs_to_jiffies(LOAD_AVG_PERIOD));
>>> + if (flags & NOHZ_STATS_KICK)
>>> + WRITE_ONCE(nohz.next_blocked,
>>> + now + msecs_to_jiffies(LOAD_AVG_PERIOD));
>>>
>>> abort:
>>> /* There is still blocked load, enable periodic update */
>>
>> I'm a bit puzzled by this; that function has:
>>
>> SCHED_WARN_ON((flags & NOHZ_KICK_MASK) == NOHZ_BALANCE_KICK);
>>
>> Which:
>>
>> - isn't updated
>> - implies STATS must be set when BALANCE
>
> Yup
>
>>
>> the latter gives rise to my confusion; why add that gate on STATS? It
>> just doesn't make sense to do a BALANCE and not update STATS.
>
> AFAIA that warning was only there to catch BALANCE && !STATS, so I didn't
> tweak it.
>
> Now, you could still end up with
>
> flags == NOHZ_NEXT_KICK
>
> (e.g. nohz.next_balance is in the future, but a new CPU entered NOHZ-idle
> and needs its own rq.next_balance collated into the nohz struct)
>
> in which case you don't do any blocked load update, hence the
> gate. In v1 I had that piggyback on NOHZ_STATS_KICK, but Vincent noted
> that might not be the best given blocked load updates can be time
> consuming - hence the separate flag.
Maybe the confusion stems from the fact that the NOHZ_NEXT_KICK-set
changes are only introduced in 2/2?
@@ -10417,6 +10418,9 @@ static void nohz_balancer_kick(struct rq *rq)
unlock:
rcu_read_unlock();
out:
+ if (READ_ONCE(nohz.needs_update))
+ flags |= NOHZ_NEXT_KICK;
+
@@ -10513,12 +10517,13 @@ void nohz_balance_enter_idle(int cpu)
...
+ WRITE_ONCE(nohz.needs_update, 1);