Re: [PATCH 4/7 v3] sched: propagate load during synchronous attach/detach

From: Peter Zijlstra
Date: Thu Sep 15 2016 - 11:14:31 EST


On Thu, Sep 15, 2016 at 02:11:49PM +0100, Dietmar Eggemann wrote:
> On 12/09/16 08:47, Vincent Guittot wrote:

> > +/* Take into account change of load of a child task group */
> > +static inline void
> > +update_tg_cfs_load(struct cfs_rq *cfs_rq, struct sched_entity *se)
> > +{
> > +        struct cfs_rq *gcfs_rq = group_cfs_rq(se);
> > +        long delta, load = gcfs_rq->avg.load_avg;
> > +
> > +        /* If the load of group cfs_rq is null, the load of the
> > +         * sched_entity will also be null so we can skip the formula
> > +         */
> > +        if (load) {
> > +                long tg_load;
> > +
> > +                /* Get tg's load and ensure tg_load > 0 */
> > +                tg_load = atomic_long_read(&gcfs_rq->tg->load_avg) + 1;
> > +
> > +                /* Ensure tg_load >= load and updated with current load */
> > +                tg_load -= gcfs_rq->tg_load_avg_contrib;
> > +                tg_load += load;
> > +
> > +                /* scale gcfs_rq's load into tg's shares */
> > +                load *= scale_load_down(gcfs_rq->tg->shares);
> > +                load /= tg_load;
> > +
> > +                /*
> > +                 * we need to compute a correction term in the case that the
> > +                 * task group is consuming <1 cpu so that we would contribute
> > +                 * the same load as a task of equal weight.
>
> Wasn't 'consuming <1' related to 'NICE_0_LOAD' and not
> scale_load_down(gcfs_rq->tg->shares) before the rewrite of PELT (v4.2,
> __update_group_entity_contrib())?


So the approximation was: min(1, runnable_avg) * shares;

And it just so happened that we tracked runnable_avg in 10-bit fixed
point, whose unit then happened to be NICE_0_LOAD.

But here we have load_avg, which already includes a '* shares' factor.
So that then becomes min(shares, load_avg).

We did, however, lose a lot of the explanation of why and how
min(1, runnable_avg) is a sensible thing to do...
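
To make that concrete, a toy side-by-side (my own standalone sketch,
not the kernel code; lmin() and the constants are made up, with 1024
standing in for the 10-bit unit / NICE_0_LOAD):

        static long lmin(long a, long b) { return a < b ? a : b; }

        /* old: min(1, runnable_avg) * shares, with runnable_avg in
         * 10-bit fixed point, so 1.0 == 1024 */
        static long old_contrib(long runnable_avg, long shares)
        {
                return lmin(runnable_avg, 1024) * shares / 1024;
        }

        /* new: load_avg already includes the '* shares' factor, so
         * the same clamp becomes min(shares, load_avg) */
        static long new_contrib(long load_avg, long shares)
        {
                return lmin(load_avg, shares);
        }

A group runnable half the time with shares == 2048 gives
old_contrib(512, 2048) == new_contrib(1024, 2048) == 1024, assuming
load_avg ~= runnable fraction * shares.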

> > +                 */
> > +                if (tg_load < scale_load_down(gcfs_rq->tg->shares)) {
> > +                        load *= tg_load;
> > +                        load /= scale_load_down(gcfs_rq->tg->shares);
> > +                }
> > +        }
>
> [...]
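
FWIW, plugging numbers into the code above makes the correction term
visible; this is my own toy rewrite with made-up inputs, not the patch
itself:

        #include <stdio.h>

        /* same arithmetic as update_tg_cfs_load(), flattened into
         * plain parameters for illustration */
        static long se_load(long load, long tg_load_avg, long contrib,
                            long shares)
        {
                long tg_load = tg_load_avg + 1; /* ensure tg_load > 0 */

                tg_load -= contrib;     /* drop our stale contribution */
                tg_load += load;        /* ... and add the current load */

                load = load * shares / tg_load; /* scale into tg's shares */

                if (tg_load < shares) {         /* group consumes <1 cpu */
                        load = load * tg_load / shares;
                }
                return load;
        }

        int main(void)
        {
                /* one group, shares == 1024, runnable ~30% of the time */
                printf("%ld\n", se_load(300, 300, 300, 1024)); /* 299 */
                return 0;
        }

So when the group consumes less than one cpu (tg_load < shares), the
scaling and the correction cancel and the entity contributes (almost
exactly, modulo integer rounding) gcfs_rq's own load_avg, i.e. the same
load as a task of equal weight; algebraically the result is
min(load_avg, load_avg * shares / tg_load).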