Re: [PATCH] sched/fair: util_est: fast ramp-up EWMA on utilization increases
From: Vincent Guittot
Date: Sun Jun 30 2019 - 04:50:00 EST
On Fri, 28 Jun 2019 at 16:10, Patrick Bellasi <patrick.bellasi@xxxxxxx> wrote:
>
> On 28-Jun 15:51, Vincent Guittot wrote:
> > On Fri, 28 Jun 2019 at 14:38, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > >
> > > On Fri, Jun 28, 2019 at 11:08:14AM +0100, Patrick Bellasi wrote:
> > > > On 26-Jun 13:40, Vincent Guittot wrote:
> > > > > Hi Patrick,
> > > > >
> > > > > On Thu, 20 Jun 2019 at 17:06, Patrick Bellasi <patrick.bellasi@xxxxxxx> wrote:
> > > > > >
> > > > > > The estimated utilization for a task is currently defined based on:
> > > > > > - enqueued: the utilization value at the end of the last activation
> > > > > > - ewma: an exponential moving average which samples are the enqueued values
> > > > > >
> > > > > > According to this definition, when a task suddenly change it's bandwidth
> > > > > > requirements from small to big, the EWMA will need to collect multiple
> > > > > > samples before converging up to track the new big utilization.
> > > > > >
> > > > > > Moreover, after the PELT scale invariance update [1], in the above scenario we
> > > > > > can see that the utilization of the task has a significant drop from the first
> > > > > > big activation to the following one. That's implied by the new "time-scaling"
> > > > >
> > > > > Could you give us more details about this? I'm not sure to understand
> > > > > what changes between the 1st big activation and the following one ?
> > > >
> > > > We are after a solution for the problem Douglas Raillard discussed at
> > > > OSPM, specifically the "Task util drop after 1st idle" highlighted in
> > > > slide 6 of his presentation:
> > > >
> > > > http://retis.sssup.it/ospm-summit/Downloads/02_05-Douglas_Raillard-How_can_we_make_schedutil_even_more_effective.pdf
> > > >
> > >
> > > So I see the problem, and I don't hate the patch, but I'm still
> > > struggling to understand how exactly it related to the time-scaling
> > > stuff. Afaict the fundamental problem here is layering two averages. The
> >
> > AFAICT, it's not related to the time-scaling
> >
> > In fact the big 1st activation happens because task runs at low OPP
> > and hasn't enough time to finish its running phase before the time to
> > begin the next one happens. This means that the task will run several
> > computations phase in one go which is no more a 75% task.
>
> But in that case, running multiple activations back to back, should we
> not expect the util_avg to exceed the 75% mark?
But task starts with a very low value and Pelt needs time to ramp up.
>
>
> > From a pelt PoV, the task is far larger than a 75% task and its
> > utilization too because it runs far longer (even after scaling time
> > with frequency).
>
> Which thus should match my expectation above, no?
But utilization has to ramp up before stabilizing to final value. The
value at the end of the 1st big activation is not what would be the
utilization if task was always that long
>
> > Once cpu reaches a high enough OPP that enable to have sleep phase
> > between each running phases, the task load tracking comes back to the
> > normal slope increase (the one that would have happen if task would
> > have jump from 5% to 75% but already running at max OPP)
>
>
> Indeed, I can see from the plots a change in slope. But there is also
> that big drop after the first big activation: 375 units in 1.1ms.
>
> Is that expected? I guess yes, since we fix the clock_pelt with the
> lost_idle_time.
>
>
> > > second (EWMA in our case) will always lag/delay the input of the first
> > > (PELT).
> > >
> > > The time-scaling thing might make matters worse, because that helps PELT
> > > ramp up faster, but that is not the primary issue.
> > >
> > > Or am I missing something?
>
> --
> #include <best/regards.h>
>
> Patrick Bellasi