Re: [PATCH] sched/pelt: sync util/runnable_sum with PELT window when propagating
From: Vincent Guittot
Date: Thu Apr 23 2020 - 12:17:34 EST
On Thu, 23 Apr 2020 at 16:30, Dietmar Eggemann <dietmar.eggemann@xxxxxxx> wrote:
>
> On 22/04/2020 17:14, Vincent Guittot wrote:
> > update_tg_cfs_util/runnable() propagate the impact of the attach/detach of
> > an entity down into the cfs_rq hierarchy which must keep the sync with
> > the current pelt window.
> >
> > Even if we can't sync child rq and its group se, we can sync the group se
>
> So we have
>
> gcfs --> tg --> gse
> ________________|
> |
> V
>
> cfs ---> tg (root)
>
> |
> V
>
> rq
>
child cfs_rq aka gcfs_rq
|
gse: group entity that represents child cfs_rq in parent cfs_rq
|
v
parent cfs_rq aka cfs_rq
>
> here. What is 'child rq' for 'group se' here? I guess 'parent cfs_rq' is
> cfs_rq.
>
> > and parent cfs_rq with current PELT window. In fact, we must keep them sync
> > in order to stay also synced with others se and group se that are already
> > attached to the cfs_rq.
> >
> > Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> > ---
> > kernel/sched/fair.c | 26 ++++++--------------------
> > 1 file changed, 6 insertions(+), 20 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 02f323b85b6d..ca6aa89c88f2 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -3441,52 +3441,38 @@ static inline void
> > update_tg_cfs_util(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cfs_rq *gcfs_rq)
> > {
> > long delta = gcfs_rq->avg.util_avg - se->avg.util_avg;
> > + u32 divider = LOAD_AVG_MAX - 1024 + cfs_rq->avg.period_contrib;
> >
> > /* Nothing to update */
> > if (!delta)
> > return;
> >
> > - /*
> > - * The relation between sum and avg is:
> > - *
> > - * LOAD_AVG_MAX - 1024 + sa->period_contrib
> > - *
> > - * however, the PELT windows are not aligned between grq and gse.
> > - */
> > -
> > /* Set new sched_entity's utilization */
> > se->avg.util_avg = gcfs_rq->avg.util_avg;
> > - se->avg.util_sum = se->avg.util_avg * LOAD_AVG_MAX;
> > + se->avg.util_sum = se->avg.util_avg * divider;
>
> divider uses cfs_rq but we sync se->avg.util_avg with gcfs_rq here.
We sync the util_avg of gse with the new util_avg of gcfs_rq, but gse
is attached to cfs_rq, so we have to use cfs_rq's period_contrib when
computing util_sum.
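
To illustrate, here is a rough userspace model of the patched
update_tg_cfs_util(). It is only a sketch: struct avg_model,
pelt_divider() and propagate_util() are made-up names for this example,
the struct is a cut-down stand-in for sched_avg, and the clamping only
approximates add_positive(). The point it shows is that gse takes its
util_avg from gcfs_rq, while the util_sum of both gse and cfs_rq is
derived from cfs_rq's period_contrib.

```c
#include <assert.h>
#include <stdint.h>

#define LOAD_AVG_MAX 47742 /* maximum PELT sum, same value as in the kernel */

/* Hypothetical, simplified stand-in for struct sched_avg (util fields only). */
struct avg_model {
	uint32_t period_contrib; /* part of the current 1024us window already accrued */
	uint32_t util_sum;
	unsigned long util_avg;
};

/* Divider that matches the current, partially elapsed PELT window. */
static uint32_t pelt_divider(const struct avg_model *sa)
{
	assert(sa->period_contrib < 1024);
	return LOAD_AVG_MAX - 1024 + sa->period_contrib;
}

/*
 * Model of the patched propagation: gse copies util_avg from gcfs_rq, but
 * util_sum of gse and of the parent cfs_rq both use the parent's divider.
 */
static void propagate_util(struct avg_model *cfs_rq, struct avg_model *gse,
			   const struct avg_model *gcfs_rq)
{
	long delta = (long)gcfs_rq->util_avg - (long)gse->util_avg;
	uint32_t divider = pelt_divider(cfs_rq);
	long res;

	/* Nothing to update */
	if (!delta)
		return;

	/* Set the group entity's utilization */
	gse->util_avg = gcfs_rq->util_avg;
	gse->util_sum = gse->util_avg * divider;

	/* Update the parent, clamping at zero (approximation of add_positive()) */
	res = (long)cfs_rq->util_avg + delta;
	cfs_rq->util_avg = res > 0 ? (unsigned long)res : 0;
	cfs_rq->util_sum = cfs_rq->util_avg * divider;
}
```

Note that gcfs_rq's own period_contrib is never read in this model; only
the window of the cfs_rq that gse is attached to matters.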
>
> But since avg.period_contrib of cfs_rq and gcfs_rq are the same this
> should work.
>
> > /* Update parent cfs_rq utilization */
> > add_positive(&cfs_rq->avg.util_avg, delta);
> > - cfs_rq->avg.util_sum = cfs_rq->avg.util_avg * LOAD_AVG_MAX;
> > + cfs_rq->avg.util_sum = cfs_rq->avg.util_avg * divider;
>
> Looks like avg.last_update_time of se (group entity), its gcfs_rq
> and cfs_rq is always the same in update_tg_cfs_[util|runnable].
>
> So that means the PELT windows are aligned for cfs_rqs and group se's?
We want to keep util_avg aligned with util_sum and period_contrib;
otherwise they can end up misaligned. It's quite similar to what is
done in attach_entity_load_avg().
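
A small userspace sketch of the misalignment, assuming the next PELT
update recomputes util_avg as util_sum / (LOAD_AVG_MAX - 1024 +
period_contrib). The helper names below are invented for this example
and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

#define LOAD_AVG_MAX 47742 /* maximum PELT sum, same value as in the kernel */

/* Divider that matches a window with period_contrib already accrued. */
static uint32_t pelt_divider(uint32_t period_contrib)
{
	assert(period_contrib < 1024);
	return LOAD_AVG_MAX - 1024 + period_contrib;
}

/* util_avg as a later update would recompute it from util_sum. */
static unsigned long avg_from_sum(uint32_t util_sum, uint32_t period_contrib)
{
	return util_sum / pelt_divider(period_contrib);
}

/* Before the patch: sum derived from the full LOAD_AVG_MAX. */
static uint32_t sum_unaligned(unsigned long util_avg)
{
	return util_avg * LOAD_AVG_MAX;
}

/* With the patch: sum derived from the divider of the current window. */
static uint32_t sum_aligned(unsigned long util_avg, uint32_t period_contrib)
{
	return util_avg * pelt_divider(period_contrib);
}
```

With util_avg = 1000 and period_contrib = 0, the unaligned sum reads
back as 1021, while the aligned sum reads back as 1000. So the old code
could make util_avg drift upward without any real activity.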
>
> And if we want to enforce this for cfs_rq and task, we have
> sync_entity_load_avg().
It's not a matter of syncing the last_update_time.
>
> [...]