Re: [PATCH v2] sched/pelt: sync util/runnable_sum with PELT window when propagating

From: Vincent Guittot
Date: Tue May 19 2020 - 11:41:49 EST


On Tue, 19 May 2020 at 12:28, Dietmar Eggemann <dietmar.eggemann@xxxxxxx> wrote:
>
> On 06/05/2020 17:53, Vincent Guittot wrote:
>
> [...]
>
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 02f323b85b6d..df3923a65162 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -3441,52 +3441,46 @@ static inline void
> > update_tg_cfs_util(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cfs_rq *gcfs_rq)
> > {
> > long delta = gcfs_rq->avg.util_avg - se->avg.util_avg;
> > + /*
> > + * cfs_rq->avg.period_contrib can be used for both cfs_rq and se.
> > + * See ___update_load_avg() for details.
> > + */
> > + u32 divider = LOAD_AVG_MAX - 1024 + cfs_rq->avg.period_contrib;
>
> Why not do the assignment (like in update_tg_cfs_load()) after the
> next condition? Same question for update_tg_cfs_runnable().

In fact, I expect the compiler to be smart enough to do this in the best place.
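
For reference, this is roughly the reordered variant I read the
suggestion as (a sketch only, the rest of update_tg_cfs_util() assumed
unchanged from the patch):

static inline void
update_tg_cfs_util(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cfs_rq *gcfs_rq)
{
	long delta = gcfs_rq->avg.util_avg - se->avg.util_avg;
	u32 divider;

	/* Nothing to update */
	if (!delta)
		return;

	/*
	 * cfs_rq->avg.period_contrib can be used for both cfs_rq and se.
	 * See ___update_load_avg() for details.
	 */
	divider = LOAD_AVG_MAX - 1024 + cfs_rq->avg.period_contrib;
	...
}

Either way, the compiler should be free to sink the computation of
divider past the early return.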

>
> [...]
>
> > static inline void
> > update_tg_cfs_runnable(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cfs_rq *gcfs_rq)
> > {
> > long delta = gcfs_rq->avg.runnable_avg - se->avg.runnable_avg;
> > + /*
> > + * cfs_rq->avg.period_contrib can be used for both cfs_rq and se.
> > + * See ___update_load_avg() for details.
> > + */
> > + u32 divider = LOAD_AVG_MAX - 1024 + cfs_rq->avg.period_contrib;
>
> We now have 6 assignments like this in fair.c and 1 in pelt.c. Could
> this not be refactored by using something like this in pelt.h:
>
> +static inline u32 get_divider(struct sched_avg *avg)

That's a good point.
I would add "pelt" to the name, something like:
static inline u32 get_pelt_divider(struct sched_avg *avg)

> +{
> + return LOAD_AVG_MAX - 1024 + avg->period_contrib;
> +}
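
With such a helper, the call sites would then shrink to something like
this (just a sketch, assuming the helper ends up in pelt.h as
get_pelt_divider()):

	long delta = gcfs_rq->avg.util_avg - se->avg.util_avg;
	/* cfs_rq->avg.period_contrib is valid for both cfs_rq and se */
	u32 divider = get_pelt_divider(&cfs_rq->avg);
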
>
> [...]
>
> > diff --git a/kernel/sched/pelt.c b/kernel/sched/pelt.c
> > index b647d04d9c8b..1feff80e7e45 100644
> > --- a/kernel/sched/pelt.c
> > +++ b/kernel/sched/pelt.c
> > @@ -237,6 +237,30 @@ ___update_load_sum(u64 now, struct sched_avg *sa,
> > return 1;
> > }
> >
> > +/*
> > + * When syncing *_avg with *_sum, we must take into account the current
> > + * position in the PELT segment otherwise the remaining part of the segment
> > + * will be considered as idle time whereas it's not yet elapsed and this will
> > + * generate unwanted oscillation in the range [1002..1024[.
> > + *
> > + * The max value of *_sum varies with the position in the time segment and is
> > + * equal to:
> > + *
> > + * LOAD_AVG_MAX*y + sa->period_contrib
> > + *
> > + * which can be simplified into:
> > + *
> > + * LOAD_AVG_MAX - 1024 + sa->period_contrib
> > + *
> > + * because LOAD_AVG_MAX*y == LOAD_AVG_MAX-1024
>
> Isn't this rather '~' instead of '==', even for y^32 = 0.5 ?
>
> 47742 * 0.5^(1/32) ~ 47742 - 1024

With integer precision and the runnable_avg_yN_inv[] array, you get
exactly 1024.
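
A quick userspace check, just mimicking decay_load()'s
mul_u64_u32_shr() with the precomputed runnable_avg_yN_inv[1] constant
(not kernel code, sketch only), gives exactly 46718 on both sides:

#include <stdio.h>
#include <stdint.h>

#define LOAD_AVG_MAX	47742
/* y^1 scaled by 2^32, i.e. runnable_avg_yN_inv[1] */
#define Y1_INV		0xfa83b2daU

int main(void)
{
	/* decay LOAD_AVG_MAX by one full 1024us period */
	uint64_t decayed = ((uint64_t)LOAD_AVG_MAX * Y1_INV) >> 32;

	printf("LOAD_AVG_MAX*y = %llu\n", (unsigned long long)decayed);
	printf("LOAD_AVG_MAX-1024 = %d\n", LOAD_AVG_MAX - 1024);
	return 0;
}

So with the fixed point math the '==' holds exactly.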

>
>
> Apart from that, LGTM
>
> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>