Re: [PATCH] sched/fair: Remove the duplicate check from group_has_capacity()

From: Valentin Schneider
Date: Tue Aug 11 2020 - 06:38:42 EST



On 11/08/20 04:39, Qi Zheng wrote:
> On 2020/8/11 2:33 AM, Valentin Schneider wrote:
>>
>> On 10/08/20 02:00, Qi Zheng wrote:
>>> 1. The group_has_capacity() function is only called in
>>> group_classify().
>>> 2. The following inequality has already been checked in
>>> group_is_overloaded() which was also called in
>>> group_classify().
>>>
>>> (sgs->group_capacity * imbalance_pct) <
>>> (sgs->group_runnable * 100)
>>>
>>
>> Consider the case where group_is_overloaded() returns false because of
>> the first condition:
>>
>>	if (sgs->sum_nr_running <= sgs->group_weight)
>>		return false;
>>
>> then group_has_capacity() would be the first place where the group_runnable
>> vs group_capacity comparison would be done.
>>
>> Now in that specific case we'll actually only check it if
>>
>> sgs->sum_nr_running == sgs->group_weight
>>
>> and the only case where the runnable vs capacity check can fail here is if
>> there's significant capacity pressure going on. TBH this capacity pressure
>> could be happening even when there are fewer tasks than CPUs, so I'm not
>> sure how intentional that corner case is.
>
> Maybe some CPUs in sg->cpumask are no longer active in the == case,
> which causes the significant capacity pressure?
>

That can only happen in the short window after a CPU has been deactivated
but before the sched_domains have been rebuilt, which sounds quite elusive.
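
To make the == corner case concrete, here is a rough userspace sketch. It
is not the kernel code: struct sg_stats, the sketch_*() helpers and
corner_case_demo() are hypothetical names, the utilization checks are left
out entirely, the early return in sketch_has_capacity() for fewer tasks than
CPUs is my assumption about the surrounding code, and imbalance_pct = 117 is
just an illustrative value.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-in for the kernel's struct sg_lb_stats. */
struct sg_stats {
	unsigned long group_capacity;	/* capacity left for tasks in the group */
	unsigned long group_runnable;	/* summed runnable signal of the group */
	unsigned int sum_nr_running;	/* number of tasks running in the group */
	unsigned int group_weight;	/* number of CPUs in the group */
};

/* The inequality quoted in the patch description. */
bool runnable_exceeds_capacity(unsigned int imbalance_pct,
			       const struct sg_stats *sgs)
{
	return (sgs->group_capacity * imbalance_pct) <
	       (sgs->group_runnable * 100);
}

/* Sketch of the overloaded test; only the runnable check is modelled. */
bool sketch_is_overloaded(unsigned int imbalance_pct,
			  const struct sg_stats *sgs)
{
	/* First condition: bail out before the runnable check is reached. */
	if (sgs->sum_nr_running <= sgs->group_weight)
		return false;

	return runnable_exceeds_capacity(imbalance_pct, sgs);
}

/* Sketch of the has-capacity test; only the runnable check is modelled. */
bool sketch_has_capacity(unsigned int imbalance_pct,
			 const struct sg_stats *sgs)
{
	/* Assumed early exit: fewer tasks than CPUs means spare capacity. */
	if (sgs->sum_nr_running < sgs->group_weight)
		return true;

	return !runnable_exceeds_capacity(imbalance_pct, sgs);
}

/* The == case: 4 tasks on 4 CPUs, capacity halved by pressure. */
void corner_case_demo(void)
{
	struct sg_stats sgs = {
		.group_capacity = 2048,	/* 4 * 1024 nominal, half of it gone */
		.group_runnable = 4096,
		.sum_nr_running = 4,
		.group_weight = 4,
	};

	/* The overloaded test never looks at runnable vs capacity here... */
	assert(!sketch_is_overloaded(117, &sgs));
	/* ...so the has-capacity test is the first (and only) place it fails. */
	assert(!sketch_has_capacity(117, &sgs));
}
```

With sum_nr_running = 3 and the same kind of pressure, the sketch reports
spare capacity without ever comparing runnable against capacity, which is the
asymmetry I was pointing at above.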