Re: [PATCH v3] sched/fair: Sync se's load_avg with cfs_rq in reweight_task

From: Dietmar Eggemann
Date: Wed Jul 24 2024 - 05:07:02 EST


On 24/07/2024 04:12, Chengming Zhou wrote:
> On 2024/7/24 07:00, Vishal Chourasia wrote:
>> On 24/07/24 2:40 am, Dietmar Eggemann wrote:
>>> On 23/07/2024 17:48, Vishal Chourasia wrote:
>>>> On Tue, Jul 23, 2024 at 07:42:47PM +0800, Chuyi Zhou wrote:

[...]

>>>> The difference between using update_load_avg() and
>>>> sync_entity_load_avg() is:
>>>> 1. update_load_avg() uses the updated PELT clock value from the rq
>>>>     structure.
>>>> 2. sync_entity_load_avg() uses the last updated time of
>>>>     the cfs_rq where the scheduling entity (se) is attached.
>>>>
>>>> Won't this affect the entity load sync?
>>>
>>> Not sure what you mean exactly by entity load sync here.
>> load avg sync for the wakee
>>>
>>> The task has been sleeping for a long time, i.e. its PELT values haven't
>>> been updated for a long time (its last_update_time (lut) value is pretty
>>> old).
>>>
>>> In the meantime the task's cfs_rq has potentially seen other PELT
>>> updates due to PELT updates of other non-sleeping tasks related to this
>>> cfs_rq. I.e. the cfs_rq lut is much more recent.
>>>
>>> What we want to do here is to sync the sleeping task with its cfs_rq. If
>>> the task was sleeping for more than 1us (1024ns) and we cross a 1ms PELT
>>> period (1024us) when we use cfs_rq's lut as the 'now' value for
>>> __update_load_avg_blocked_se() then we will see the task PELT values
>>> decay.
>>>
>>> We rely on sync_entity_load_avg() for instance in EAS wakeup where the
>>> task's util_avg influences which CPU type the task will run on next. So
>>> we sync the wakee with its cfs_rq to be able to work with a current task
>>> util_avg.
>> I was talking about the case where all the tasks on a cfs_rq are
>> sleeping.
>> In this case, the lut of the cfs_rq will be the same as at the time of
>> the last dequeue.
>
> In this case, cfs_rq is not on_rq, its load_sum/avg will be updated on
> the next enqueue. (Or periodically updated from load balance)

Yes, cfs_rq PELT values of an idle CPU decay via
sched_balance_update_blocked_averages() -> __update_blocked_fair()
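
To illustrate the sync step discussed above, here is a rough userspace
sketch, not the kernel code. All toy_* names are made up for this example,
util_avg is a plain double instead of the kernel's fixed-point sums, and the
decay factor is an approximation of 2^(-1/32). It only models the two
conditions mentioned earlier: nothing happens below 1024ns, and decay only
happens once a full 1024us period has been crossed, with the cfs_rq's lut
used as 'now'.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy userspace model; all names are hypothetical, not kernel symbols. */
#define TOY_PERIOD_US	1024u		/* one PELT period, roughly 1ms */
#define TOY_HALFLIFE	32u		/* periods after which a signal halves */
#define TOY_Y		0.97857206	/* approx. 2^(-1/32), so y^32 ~= 0.5 */

struct toy_se {
	uint64_t lut_ns;		/* last_update_time of the entity */
	uint32_t period_contrib;	/* us already spent in the current period */
	double   util_avg;		/* floating-point stand-in for the PELT avg */
};

/* Decay val over n full periods: halve per 32 periods, then y^(n % 32). */
static double toy_decay(double val, uint64_t n)
{
	if (n > TOY_HALFLIFE * 63)
		return 0.0;		/* slept so long that nothing is left */
	for (; n >= TOY_HALFLIFE; n -= TOY_HALFLIFE)
		val *= 0.5;
	for (; n > 0; n--)
		val *= TOY_Y;
	return val;
}

/*
 * Sync a blocked entity to its cfs_rq, using the cfs_rq's lut as 'now'.
 * Returns the number of full periods crossed (0 means no decay happened).
 */
static uint64_t toy_sync_entity(struct toy_se *se, uint64_t cfs_rq_lut_ns)
{
	uint64_t delta_us, periods;

	assert(se != NULL);
	if (cfs_rq_lut_ns <= se->lut_ns)
		return 0;		/* cfs_rq lut is not newer than the se's */

	delta_us = (cfs_rq_lut_ns - se->lut_ns) >> 10;	/* 1024ns units */
	if (delta_us == 0)
		return 0;		/* less than 1024ns has passed */
	se->lut_ns += delta_us << 10;

	delta_us += se->period_contrib;
	periods = delta_us / TOY_PERIOD_US;
	se->period_contrib = (uint32_t)(delta_us % TOY_PERIOD_US);
	if (periods)
		se->util_avg = toy_decay(se->util_avg, periods);
	return periods;
}
```

In this model a task with util_avg = 512 whose cfs_rq lut has advanced by 32
periods ends up at 256 after the sync, while an advance of a few microseconds
only moves the entity's lut forward and leaves util_avg alone.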

[...]