Re: [PATCH] sched/fair: Fix the nohz.next_balance update mess

From: Wanpeng Li
Date: Mon Feb 06 2017 - 17:06:12 EST


2017-02-06 21:23 GMT+08:00 Vincent Guittot <vincent.guittot@xxxxxxxxxx>:
> On 6 February 2017 at 09:33, Wanpeng Li <kernellwp@xxxxxxxxx> wrote:
>> Hi Vincent,
>> 2017-02-06 16:07 GMT+08:00 Vincent Guittot <vincent.guittot@xxxxxxxxxx>:
>>> Hi Wanpeng
>>>
>>> On 5 February 2017 at 10:57, Wanpeng Li <kernellwp@xxxxxxxxx> wrote:
>>>> From: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
>>>>
>>>> The commit:
>>>> c5afb6a87f2 ("sched/fair: Fix nohz.next_balance update")
>>>>
>>>> intends to update nohz.next_balance in two steps.
>>>>
>>>> 1) The ILB CPU utilizes next_balance variable in nohz_idle_balance()
>>>> to gather the shortest next balance of other idle CPUs before
>>>> updating nohz.next_balance.
>>>> 2) The ILB CPU updates the nohz.next_balance according to its own
>>>> next_balance after load balance on behalf of other idle CPUs.
>>>>
>>>> However, there is a mess which breaks the original intention of the
>>>
>>> Have you got details of the mess that this generates ?
>>>
>>>> first step, every idle CPUs update nohz.next_balance during ILB CPU
>>>> on behalf of them to do load balance, and then the ILB CPU utilizes
>>>> next_balance variable in nohz_idle_balance() to gather the shortest
>>>> next balance of other idle CPUs before updating nohz.next_balance.
>>>>
>>>> This patch fixes it by don't update nohz.next_balance for other idle
>>>> CPUs when ILB CPU on behalf of them to do load balance.
>>>
>>> But how do you take into account the next balance of other idle CPUs ?
>>
>> The step 1) which I describe above for your original commit takes it
>> into account. In addition, please refers to the comments which you
>> added(rebalance_domains()) in the original commit:
>
> yes sorry I mixed rebalance_domains and nohz_idle_balance code
>
> But are you sure that this additional condition will change anything ?
>
> When an ILB is triggered, it means that nohz.next_balance is before jiffies.
> Then, for all Idle CPUs (except the ILB CPU), the rq->next_balance
> will be for sure after nohz.next_balance once we have finished the
> for_each_domain loop of rebalance_domain() so it can't trig
> nohz.next_balance = rq->next_balance and the current condition if
> fine.

Oh, I miss it. Thanks for your pointing out.

Regards,
Wanpeng Li