Re: [PATCH] sched/pelt: sync util/runnable_sum with PELT window when propagating

From: Dietmar Eggemann
Date: Thu Apr 23 2020 - 10:30:14 EST




On 22/04/2020 17:14, Vincent Guittot wrote:
> update_tg_cfs_util/runnable() propagate the impact of the attach/detach of
> an entity down into the cfs_rq hierarchy which must keep the sync with
> the current pelt window.
>
> Even if we can't sync child rq and its group se, we can sync the group se

So we have

gcfs --> tg --> gse
________________|
|
V

cfs ---> tg (root)

|
V

rq


here. What is 'child rq' for 'group se' here? I guess 'parent cfs_rq' is
cfs_rq.

> and parent cfs_rq with current PELT window. In fact, we must keep them sync
> in order to stay also synced with others se and group se that are already
> attached to the cfs_rq.
>
> Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> ---
> kernel/sched/fair.c | 26 ++++++--------------------
> 1 file changed, 6 insertions(+), 20 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 02f323b85b6d..ca6aa89c88f2 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3441,52 +3441,38 @@ static inline void
> update_tg_cfs_util(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cfs_rq *gcfs_rq)
> {
> long delta = gcfs_rq->avg.util_avg - se->avg.util_avg;
> + u32 divider = LOAD_AVG_MAX - 1024 + cfs_rq->avg.period_contrib;
>
> /* Nothing to update */
> if (!delta)
> return;
>
> - /*
> - * The relation between sum and avg is:
> - *
> - * LOAD_AVG_MAX - 1024 + sa->period_contrib
> - *
> - * however, the PELT windows are not aligned between grq and gse.
> - */
> -
> /* Set new sched_entity's utilization */
> se->avg.util_avg = gcfs_rq->avg.util_avg;
> - se->avg.util_sum = se->avg.util_avg * LOAD_AVG_MAX;
> + se->avg.util_sum = se->avg.util_avg * divider;

divider uses cfs_rq but we sync se->avg.util_avg with gcfs_rq here.

But since avg.period_contrib of cfs_rq and gcfs_rq are the same this
should work.

> /* Update parent cfs_rq utilization */
> add_positive(&cfs_rq->avg.util_avg, delta);
> - cfs_rq->avg.util_sum = cfs_rq->avg.util_avg * LOAD_AVG_MAX;
> + cfs_rq->avg.util_sum = cfs_rq->avg.util_avg * divider;

Looks like that avg.last_update_time of se (group entity), it's gcfs_rq
and cfs_rq is always the same in update_tg_cfs_[util\|runnable].

So that means the PELT windows are aligned for cfs_rqs and group se's?

And if we want to enforce this for cfs_rq and task, we have
sync_entity_load_avg().

[...]