Re: [PATCH v2] sched/fair: fix initial util_avg calculation

From: Vishal Chourasia
Date: Tue Apr 02 2024 - 04:58:29 EST


On 02/04/24 2:19 pm, Vincent Guittot wrote:
> On Tue, 2 Apr 2024 at 10:44, Vishal Chourasia <vishalc@xxxxxxxxxxxxx> wrote:
>>
>> On 02/04/24 11:17 am, Dawei Li wrote:
>>> Hi Vishal
>>>
>>> Thanks for the comment!
>>> Do you suggest using scale_load_down() in place of se_weight()?
>> scale_load_down should be better.
>
> se_weight is used for computing sched_entity's pelt signal so keep
> using it looks better but all this clearly just nitpick because that
> doesn't make any difference
Alright. Thank you for the clarification.
>
>>> It's a soft bug we should fix one way or another before what the
>>> comment mentions really happens.
>> IIUC, We should be moving towards using full load resolution
>> for all the calculations. In that case, we need not worry about scaling load at
>> all. Maybe someone could provide context here.
>>
>>> I am actually confused that we have both se_weight() and
>>> scale_load_down(), and they do the same thing.
>>>
>>> Best regards,
>>> Dawei
>>>
>>> On Mon, Apr 1, 2024 at 3:36 AM Vishal Chourasia <vishalc@xxxxxxxxxxxxx> wrote:
>>>>
>>>> On Thu, Mar 14, 2024 at 06:59:16PM -0700, Dawei Li wrote:
>>>>> Change se->load.weight to se_weight(se) in the calculation for the
>>>>> initial util_avg to avoid unnecessarily inflating the util_avg by 1024
>>>>> times.
>>>>>
>>>>> The reason is that se->load.weight has the unit/scale as the scaled-up
>>>>> load, while cfs_rg->avg.load_avg has the unit/scale as the true task
>>>>> weight (as mapped directly from the task's nice/priority value). With
>>>>> CONFIG_32BIT, the scaled-up load is equal to the true task weight. With
>>>>> CONFIG_64BIT, the scaled-up load is 1024 times the true task weight.
>>>>> Thus, the current code may inflate the util_avg by 1024 times. The
>>>>> follow-up capping will not allow the util_avg value to go wild. But the
>>>>> calculation should have the correct logic.
>>>>>
>>>>> Signed-off-by: Dawei Li <daweilics@xxxxxxxxx>
>>>>> ---
>>>>> Changes in v2:
>>>>> - update the commit message
>>>>> ---
>>>>> kernel/sched/fair.c | 5 +++--
>>>>> 1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>>>> index a19ea290b790..5f98f639bdb9 100644
>>>>> --- a/kernel/sched/fair.c
>>>>> +++ b/kernel/sched/fair.c
>>>>> @@ -1031,7 +1031,8 @@ void init_entity_runnable_average(struct sched_entity *se)
>>>>> * With new tasks being created, their initial util_avgs are extrapolated
>>>>> * based on the cfs_rq's current util_avg:
>>>>> *
>>>>> - * util_avg = cfs_rq->util_avg / (cfs_rq->load_avg + 1) * se.load.weight
>>>>> + * util_avg = cfs_rq->avg.util_avg / (cfs_rq->avg.load_avg + 1)
>>>>> + * * se_weight(se)
>>>>> *
>>>>> * However, in many cases, the above util_avg does not give a desired
>>>>> * value. Moreover, the sum of the util_avgs may be divergent, such
>>>>> @@ -1078,7 +1079,7 @@ void post_init_entity_util_avg(struct task_struct *p)
>>>>>
>>>>> if (cap > 0) {
>>>>> if (cfs_rq->avg.util_avg != 0) {
>>>>> - sa->util_avg = cfs_rq->avg.util_avg * se->load.weight;
>>>>> + sa->util_avg = cfs_rq->avg.util_avg * se_weight(se);
>>>> Hi,
>>>>
>>>> The comment above the declaration of se_weight function says we should be
>>>> using full load resolution and get rid of this helper.
>>>>
>>>> Should we be adding new user of the helper?
>>>>
>>>> /*
>>>> * XXX we want to get rid of these helpers and use the full load resolution.
>>>> */
>>>> static inline long se_weight(struct sched_entity *se)
>>>> {
>>>> return scale_load_down(se->load.weight);
>>>> }
>>>>
>>>>
>>>>> sa->util_avg /= (cfs_rq->avg.load_avg + 1);
>>>>>
>>>>> if (sa->util_avg > cap)
>>>>> --
>>>>> 2.40.1
>>>>>
>>