Re: [PATCH] sched/fair: Update nohz.next_balance for newly NOHZ-idle CPUs
From: Valentin Schneider
Date: Thu Jul 15 2021 - 10:51:25 EST

On 15/07/21 15:01, Vincent Guittot wrote:
> On Thu, 15 Jul 2021 at 13:56, Valentin Schneider <valentin.schneider@xxxxxxx> wrote:
>> On 15/07/21 09:42, Vincent Guittot wrote:
>> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> > index 44e44c235f1f..91c314f58982 100644
>> > --- a/kernel/sched/fair.c
>> > +++ b/kernel/sched/fair.c
>> > @@ -10657,6 +10657,9 @@ static void nohz_newidle_balance(struct rq *this_rq)
>> >  	if (this_rq->avg_idle < sysctl_sched_migration_cost)
>> >  		return;
>> >
>> > +	if (time_before(this_rq->next_balance, READ_ONCE(nohz.next_balance)))
>> > +		WRITE_ONCE(nohz.need_update, 1);
>> > +
>>
>> I think we have to do this unconditionally, as we can observe the old
>> nohz.next_balance while a NOHZ balance is ongoing (which will update
>> nohz.next_balance without taking into account this newly idle CPU).
>
> So maybe add this in nohz_balance_enter_idle(), after the
> smp_mb__after_atomic(). The ILB will see the CPU in idle_cpus_mask, so
> even if nohz.next_balance is updated, it will take this newly idle CPU
> into account.
>
> My goal was to use a mechanism similar to what is used for nohz.has_blocked
>
OK, and then clearing it above the smp_mb() in _nohz_idle_balance() should
give us guarantees similar to those of nohz.has_blocked (i.e. if we don't
observe the cpumask write, then we'll observe the nohz.need_update write).

Thanks for the suggestion, I'll go test this out.
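
For reference, the ordering argument above can be sketched in userspace
with C11 atomics. This is only a model under stated assumptions, not
kernel code: idle_cpus_mask and need_update are plain atomics standing in
for the nohz fields, the model_*() helpers are hypothetical names, and
seq_cst fences approximate smp_mb__after_atomic() and smp_mb().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins for the nohz state (assumption). */
static atomic_ulong idle_cpus_mask;	/* one bit per CPU */
static atomic_int need_update;		/* flag proposed in the diff above */

/* Writer side, modelled on nohz_balance_enter_idle(). */
static void model_enter_idle(unsigned int cpu)
{
	/* Publish the CPU in the idle mask first. */
	atomic_fetch_or_explicit(&idle_cpus_mask, 1UL << cpu,
				 memory_order_relaxed);
	/* Stands in for smp_mb__after_atomic(). */
	atomic_thread_fence(memory_order_seq_cst);
	/* Only then raise the flag. */
	atomic_store_explicit(&need_update, 1, memory_order_relaxed);
}

/* Reader side, modelled on _nohz_idle_balance(): returns the mask it saw. */
static unsigned long model_ilb_pass(void)
{
	/* Clear the flag above the barrier. */
	atomic_store_explicit(&need_update, 0, memory_order_relaxed);
	/* Stands in for smp_mb(). */
	atomic_thread_fence(memory_order_seq_cst);
	/* Then read the idle mask. */
	return atomic_load_explicit(&idle_cpus_mask, memory_order_relaxed);
}

/*
 * The invariant under discussion: once model_enter_idle(cpu) has returned,
 * either the ILB snapshot contains the CPU, or need_update is still set.
 */
static bool model_cpu_is_covered(unsigned long snapshot, unsigned int cpu)
{
	return (snapshot & (1UL << cpu)) ||
	       atomic_load_explicit(&need_update, memory_order_relaxed);
}
```

With both fences in place, a reader that misses the mask write must have
cleared the flag before the writer set it, so the flag stays at 1 and the
next pass picks the CPU up. Dropping either fence loses that guarantee.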