Re: [PATCH 1/2] x86,sched: Add support for frequency invariance

From: Srinivas Pandruvada
Date: Thu Sep 19 2019 - 19:55:34 EST


On Tue, 2019-09-17 at 16:27 +0200, Giovanni Gherdovich wrote:
> Hello Srinivas,
>
> On Fri, 2019-09-13 at 15:52 -0700, Srinivas Pandruvada wrote:
> > On Mon, 2019-09-09 at 04:42 +0200, Giovanni Gherdovich wrote:
> >
> > ...
> >
> > > +
> > > +/*
> > > + * APERF/MPERF frequency ratio computation.
> > > + *
> > > + * The scheduler wants to do frequency invariant accounting and needs a <1
> > > + * ratio to account for the 'current' frequency, corresponding to
> > > + * freq_curr / freq_max.
> >
> > I thought this is no longer the restriction and Vincent did some work
> > to remove this restriction.
>
> If you're referring to the patch
>
> 23127296889f "sched/fair: Update scale invariance of PELT"
>
> merged in v5.2, I'm familiar with that and from my understanding you still
> want a <1 scaling factor. This is my recollection of the patch:
>
> Vincent was studying some synthetic traces and realized that util_avg reported
> by PELT didn't quite match the result you'd get computing the formula with pen
> and paper (theoretical value). To address this he changed where the scaling
> factor is applied in the PELT formula.
>
> At some point when accumulating the PELT sums, you'll have to measure the time
> 'delta' since you last updated PELT. What we have after Vincent's change is
> that this time length 'delta' gets itself scaled by the freq_curr/freq_max
> ratio:
>
> delta = time since last PELT update
> delta *= freq_percent
>
> In this way time goes at "wall clock speed" only when you're running at max
> capacity, and goes "slower" (from the PELT point of view) if we're running at
> a lower frequency. I don't think Vincent had in mind a faster-than-wall-clock
> PELT time (which you'd get w/ freq_percent>1).
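
A minimal user-space sketch of the scaling described above, for illustration only. It assumes the usual fixed-point convention where 1024 stands for a ratio of 1.0; the names freq_scale_from_khz() and scale_pelt_delta() are hypothetical and are not taken from the patch or from the kernel sources.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: a ratio of 1.0 is represented as 1024 (10-bit fixed point). */
#define PELT_CAPACITY_SHIFT 10
#define PELT_CAPACITY_SCALE (1ULL << PELT_CAPACITY_SHIFT)

/*
 * Hypothetical helper: turn freq_curr / freq_max into a fixed-point
 * factor, clipped so that it never exceeds 1.0.
 */
static uint64_t freq_scale_from_khz(uint64_t cur_khz, uint64_t max_khz)
{
	uint64_t scale;

	assert(max_khz > 0);
	scale = (cur_khz << PELT_CAPACITY_SHIFT) / max_khz;

	/* Clip: PELT time should never run faster than wall-clock time. */
	if (scale > PELT_CAPACITY_SCALE)
		scale = PELT_CAPACITY_SCALE;
	return scale;
}

/*
 * Hypothetical helper: scale the time elapsed since the last PELT
 * update by the frequency factor (delta *= freq_percent).
 */
static uint64_t scale_pelt_delta(uint64_t delta_ns, uint64_t freq_scale)
{
	return (delta_ns * freq_scale) >> PELT_CAPACITY_SHIFT;
}
```

With these helpers, a CPU running at half of freq_max accrues PELT time at half speed: scale_pelt_delta(1000, freq_scale_from_khz(2000000, 4000000)) gives 500. A frequency above freq_max is clipped, so the delta is left unchanged.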
>
> Speaking of which, Srinivas, do you have any opinion and/or requirement about
> this? I vaguely remember Peter Zijlstra saying (more than a year ago, now)
> that you would like an unclipped freq_curr/freq_max ratio, and may not be
> happy with this patch clipping it to 1 when freq_curr > 4_cores_turbo. If
> that's the case, could you elaborate on this?
> Ignore that if it doesn't make sense, I may be mis-remembering.

I was thinking of the power efficiency use case, particularly on Atom-like
platforms, where 1C max is more efficient, as you observed.

But the sched deadline code now uses arch_scale_freq_capacity() to
calculate dl_se->runtime, where a closer-to-deterministic value with all
cores may be better, which will be scaled with base_freq.
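
To make the concern concrete, here is a rough user-space sketch of how a frequency scale factor can feed into the runtime charged to a deadline task. It is loosely modeled on my understanding of the deadline class and is not the kernel code: dl_cap_scale() and dl_charge_runtime() are hypothetical names, and 1024 is assumed to represent a ratio of 1.0.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: a ratio of 1.0 is represented as 1024 (10-bit fixed point). */
#define DL_CAPACITY_SHIFT 10
#define DL_CAPACITY_SCALE (1ULL << DL_CAPACITY_SHIFT)

/* Hypothetical helper: multiply a value by a fixed-point factor. */
static uint64_t dl_cap_scale(uint64_t v, uint64_t s)
{
	return (v * s) >> DL_CAPACITY_SHIFT;
}

/*
 * Hypothetical helper: charge executed time against the remaining
 * runtime budget, after scaling it by the frequency and CPU capacity
 * factors. Returns the new remaining runtime, which may go negative
 * when the budget is overrun.
 */
static int64_t dl_charge_runtime(int64_t runtime_ns, uint64_t delta_exec_ns,
				 uint64_t freq_scale, uint64_t cpu_scale)
{
	uint64_t scaled;

	assert(freq_scale <= DL_CAPACITY_SCALE);
	assert(cpu_scale <= DL_CAPACITY_SCALE);

	scaled = dl_cap_scale(delta_exec_ns, freq_scale);
	scaled = dl_cap_scale(scaled, cpu_scale);
	return runtime_ns - (int64_t)scaled;
}
```

In this sketch the same 2000 ns of execution costs 2000 ns of budget at a factor of 1024 but only 1000 ns at 512, so the choice of reference frequency (1C max, 4C turbo or base_freq) directly changes how quickly the budget drains.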

Thanks,
Srinivas