Re: [PATCH v2 1/3] sched/pelt: Don't sync hardly util_sum with uti_avg
From: Dietmar Eggemann
Date: Tue Jan 11 2022 - 07:37:53 EST
On 11/01/2022 08:54, Vincent Guittot wrote:
> On Fri, 7 Jan 2022 at 16:21, Vincent Guittot <vincent.guittot@xxxxxxxxxx> wrote:
>>
>> On Fri, 7 Jan 2022 at 12:43, Dietmar Eggemann <dietmar.eggemann@xxxxxxx> wrote:
>>>
>>> On 05/01/2022 14:57, Vincent Guittot wrote:
>>>> On Wed, 5 Jan 2022 at 14:15, Dietmar Eggemann <dietmar.eggemann@xxxxxxx> wrote:
>>>>>
>>>>> On 04/01/2022 14:42, Vincent Guittot wrote:
>>>>>> On Tue, 4 Jan 2022 at 12:47, Dietmar Eggemann <dietmar.eggemann@xxxxxxx> wrote:
>>>>>>>
>>>>>>> On 22/12/2021 10:38, Vincent Guittot wrote:
[...]
>>>> Some of the changes done in PELT signal propagation that replace
>>>> subtracting util_sum by using util_avg * divider instead, are related
>>>> to other problems with sched group hierarchy and
>>>> throttling/unthrottling. I'm not 100% confident that using fixes tag
>>>> to backport this on stables doesn't need to backport more patches on
>>>> other areas in order to not resurrect old problems. So I wonder if
>>>> it's worth backporting this on stables
>>>
>>> OK, I see. So only 1c35b07e6d39 (i.e. the util _sum/_avg change in
>>> update_cfs_rq_load_avg() (1)) caused the CPU frequency regression. That
>>> was the reason why I initially suggested to split the patch-set
>>> differently. But you said that you saw the issue also when (1) is fixed.
>>
>> Ok, I think that we were not speaking about the same setup. I wrongly
>> read that you were saying that
>> sa->util_sum = max_t(u32, sa->util_sum, sa->util_avg * MIN_DIVIDER);
>> was only needed in update_cfs_rq_load_avg() but not in the other places.
>>
>> But what you said is that we only need the below to fix the perf
>> regression raised by rick ?
>> r = removed_util;
>> sub_positive(&sa->util_avg, r);
>> - sa->util_sum = sa->util_avg * divider;
>> + sub_positive(&sa->util_sum, r * divider);
>> + sa->util_sum = max_t(u32, sa->util_sum, sa->util_avg * MIN_DIVIDER);
>
> The test with the code above doesn't trigger any SCHEd_WARN over the
> weekend so it's probably ok to make a dedicated patch for this with
> tag.
> I'm going to prepare a v2
Yes, `sa->X_sum = max_t(u32, sa->X_sum, sa->X_avg * MIN_DIVIDER)` is
needed for all 3 X = [load, runnable, util] in update_cfs_rq_load_avg()
to not hit the SCHED_WARN_ON() in cfs_rq_is_decayed().
>> The WARN that I mentioned in my previous email was about not adding
>> the max_t in all 3 places. I rerun some test today and I triggered the
>> WARN after a detach without the max_t line.
>>
>> I can probably isolate the code above in a dedicated patch for the
>> regression raised by Rick and we could consider adding a fixes tag; I
>> will run more tests with only this part during the weekend.
>> That being said, we need to stay consistent in all 3 places where we
>> move or propagate some *_avg. In particular, using "sa->util_sum =
>> sa->util_avg * divider" has the drawback of filtering the contribution
>> not already accounted for in util_avg and the impact is much larger
>> than expected. It means that although fixing only
>> update_cfs_rq_load_avg() seems enough for rick's case, some other
>> cases could be impacted by the 2 other places and we need to fixed
>> them now that we have a better view of the root cause
Agreed.