Re: [PATCH v2] sched/fair: update scale invariance of PELT
From: Vincent Guittot
Date: Tue Apr 11 2017 - 08:08:53 EST
Le Tuesday 11 Apr 2017 à 12:41:36 (+0200), Peter Zijlstra a écrit :
> On Tue, Apr 11, 2017 at 11:40:21AM +0200, Vincent Guittot wrote:
> > Le Tuesday 11 Apr 2017 à 10:53:05 (+0200), Peter Zijlstra a écrit :
> > > On Tue, Apr 11, 2017 at 09:52:21AM +0200, Vincent Guittot wrote:
> > > > Le Monday 10 Apr 2017 à 19:38:02 (+0200), Peter Zijlstra a écrit :
> > > > >
> > > > > Thanks for the rebase.
> > > > >
> > > > > On Mon, Apr 10, 2017 at 11:18:29AM +0200, Vincent Guittot wrote:
> > > > >
> > > > > Ok, so let me try and paraphrase what this patch does.
> > > > >
> > > > > So consider a task that runs 16 out of our 32ms window:
> > > > >
> > > > > running idle
> > > > > |---------|---------|
>
> (A)
>
> > > > >
> > > > >
> > > > > You're saying that when we scale running with the frequency, suppose we
> > > > > were at 50% freq, we'll end up with:
> > > > >
> > > > > run idle
> > > > > |----|---------|
>
> (B)
>
> > > > >
> > > > >
> > > > > Which is obviously a shorter total then before; so what you do is add
> > > > > back the lost idle time like:
> > > > >
> > > > > run lost idle
> > > > > |----|----|---------|
>
> (C)
>
> > > > >
> > > > >
> > > > > to arrive at the same total time. Which seems to make sense.
> > > >
> > > > Yes
> > >
> > > OK, bear with me.
> > >
> > >
> > > So we have:
> > >
> > >
> > > util_sum' = utilsum * y^p +
> > >
> > > p-1
> > > d1 * y^p + 1024 * \Sum y^n + d3 * y^0
> > > n=1
> > >
> > > For the unscaled version, right?
> >
> > Yes for the running state.
> >
> > For sleeping state, it's just util_sum' = utilsum * y^p
>
> Sure, and from this is follows that for idle time we add 0, while we do
> decay. Lets call this (1).
>
> > >
> > >
> > >
> > > Now for the scaled version, instead of adding a full 'd1,d2,d3' running
> > > segments, we want to add partially running segments, where r=f*d/f_max,
> > > and lost segments l=d-r to fill out the idle time.
> > >
> > > But afaict we then end up with (F=f/f_max):
> > >
> > >
> > > util_sum' = utilsum * y^p +
> > >
> > > p-1
> > > F * d1 * y^p + F * 1024 * \Sum y^n + F * d3 * y^0
> > > n=1
> >
> > you also have a longer running time as it runs slower. We make the assumption that
> > d1+d2+d3 = (d1'+d2'+d3') * F
>
> No, d's stay the same length, r's are the scaled d, and l's the
> remainder, or lost idle time.
>
> That is; r + l = d, that way the time scale stays invariant as above (A)
> & (C).
>
> So if we run slower, we scale back r and l becomes !0.
>
> > If we consider that we cross a decay window, we still have the d1 to
> > complete the past one but then p'*F= p and d'3 will be the remaining
> > part of the current window and most probably not equal to d3
>
> So by doing r=Fd we get some (lost) idle time for every bit of runtime,
> equally distributed, as if the CPU inserted NOP cycles to lower the
> effective frequency.
>
> You want to explicitly place the idle time at the end? That would bias
> the sum downwards. To what point?
>
> > so we have with current invariance:
> >
> > util_sum' = utilsum * y^(p/F) +
> > (p/F - 1)
> > F * d1 * y^(p/F) + F * 1024 * \Sum y^n + F * d'3 * y^0
> > n=1
>
> No, we don't have p/F. p is very much _NOT_ scaled.
ok. so confusion may come from that we don't have the same meaning of p and I have skipped important intermediate formula.
I'm going to continue on the new fresh thread
>
> Look at accumulate_sum(), we compute p from d, not r.
>
> > >
> > > And we can collect the common term F:
> > >
> > > util_sum' = utilsum * y^p +
> > >
> > > p-1
> > > F * (d1 * y^p + 1024 * \Sum y^n + d3 * y^0)
> > > n=1
> > >
> > >
> > > Which is exactly what we already did.
> >
> > In the new invariance scale, the F is applied on p not on the contribution
> > value
>
> That's wrong... That would result in (B) not (C).