Re: [PATCH v4 2/6] x86,sched: Add support for frequency invariance on SKYLAKE_X

From: Giovanni Gherdovich
Date: Thu Dec 19 2019 - 15:23:32 EST


On Wed, 2019-12-18 at 21:06 +0100, Peter Zijlstra wrote:
> On Wed, Nov 13, 2019 at 01:46:50PM +0100, Giovanni Gherdovich wrote:
> > The scheduler needs the ratio freq_curr/freq_max for frequency-invariant
> > accounting. On SKYLAKE_X CPUs set freq_max to the highest frequency that can
> > be sustained by a group of at least 4 cores.
> >
> > From the changelog of commit 31e07522be56 ("tools/power turbostat: fix
> > decoding for GLM, DNV, SKX turbo-ratio limits"):
> >
> > > Newer processors do not hard-code the number of cpus in each bin
> > > to {1, 2, 3, 4, 5, 6, 7, 8}. Rather, they can specify any number
> > > of CPUS in each of the 8 bins:
> > >
> > > eg.
> > >
> > > ...
> > > 37 * 100.0 = 3600.0 MHz max turbo 4 active cores
> > > 38 * 100.0 = 3700.0 MHz max turbo 3 active cores
> > > 39 * 100.0 = 3800.0 MHz max turbo 2 active cores
> > > 39 * 100.0 = 3900.0 MHz max turbo 1 active cores
> > >
> > > could now look something like this:
> > >
> > > ...
> > > 37 * 100.0 = 3600.0 MHz max turbo 16 active cores
> > > 38 * 100.0 = 3700.0 MHz max turbo 8 active cores
> > > 39 * 100.0 = 3800.0 MHz max turbo 4 active cores
> > > 39 * 100.0 = 3900.0 MHz max turbo 2 active cores
> >
> > This encoding of turbo levels applies to both SKYLAKE_X and GOLDMONT/GOLDMONT_D,
> > but we treat these two classes in separate commits because their freq_max
> > values need to be different. For SKX we prefer a lower freq_max in the ratio
> > freq_curr/freq_max, allowing load and utilization to overshoot and the
> > schedutil governor to be more performance-oriented. Models from the Atom
> > series (such as GOLDMONT*) are handled in a forthcoming commit as they have to
> > favor power-efficiency over performance.
>
> Can we at least use a single function to decode both? A little like the
> below. I'm not married to the naming, but I think it is a little silly
> to have 2 different functions to decode the exact same MSRs.
>
> (one could even go as far as to make a boot param to override the {1,4}
> default core count for these things)

Sure, not seeing that was a gross oversight on my part.
Thanks for catching it and sketching a solution.
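
For illustration only, here is a minimal userspace sketch of what a single
shared decoder could look like. It is not the code from this series: the
function name turbo_ratio_for_group() is hypothetical, the two 64-bit
arguments stand in for raw MSR contents that the caller has already read,
and the layout (eight 8-bit ratios in one value, eight 8-bit core counts in
the other, bins ordered by increasing core count starting at the low byte)
is an assumption based on the turbostat changelog quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the kernel implementation.
 *
 * ratios: eight 8-bit turbo ratios, one per bin (assumed layout).
 * counts: eight 8-bit core counts, one per bin, in the same order.
 *
 * Walks the bins from the low byte upwards and stores the ratio of the
 * first bin whose core count is at least min_cores. Returns false if no
 * bin covers that many cores.
 */
static bool turbo_ratio_for_group(uint64_t ratios, uint64_t counts,
				  unsigned int min_cores,
				  unsigned int *turbo_ratio)
{
	unsigned int shift;

	assert(turbo_ratio != NULL);

	for (shift = 0; shift < 64; shift += 8) {
		unsigned int group_size = (counts >> shift) & 0xFF;

		if (group_size >= min_cores) {
			*turbo_ratio = (ratios >> shift) & 0xFF;
			return true;
		}
	}

	return false;
}
```

With a helper of this shape, the only difference between the two CPU classes
would be the min_cores argument: 4 for the SKX policy described in the
changelog, and the other value from the {1,4} pair for the Atom models. A
boot parameter, as suggested, would simply override that argument.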

Giovanni