Re: [RFT][Update][PATCH 2/2] cpufreq: intel_pstate: Update max CPU frequency on global turbo changes
From: Rafael J. Wysocki
Date: Wed Mar 06 2019 - 05:06:03 EST
On Tue, Mar 5, 2019 at 6:37 PM Quentin Perret <quentin.perret@xxxxxxx> wrote:
>
> On Tuesday 05 Mar 2019 at 18:02:25 (+0100), Rafael J. Wysocki wrote:
> > But that 128 needs to be compared to
> >
> > (SCHED_CAPACITY_SCALE * cpuinfo.min_freq) / cpuinfo.max_freq
> >
> > so with SCHED_CAPACITY_SCALE equal to 1024 this means max_freq 8x
> > higher than min_freq. That is not totally unreasonable IMO and
> > because sg_cpu->iowait_boost grows exponentially, the difference
> > between 8x and, say, 4x is one iteration.
> >
> > > The first steps will all be below the min freq, so they'll just be
> > > transparent, while right now the iowait boost kicks in much faster :/
> >
> > There can be one iteration of a difference this way or that way AFAICS
> > and I'm not even sure how much of a performance difference that makes
> > in practice.
>
> Yeah I don't expect that to have a huge impact TBH but it'd be nice to
> actually get numbers to verify that, that's all I'm saying :-)
>
> You have 'funny' platforms like Juno r0 out there where the min/max
> frequencies are 450MHz/850Mhz. In this case, starting from 128 you'll
> need 3 wake-ups to reach what is currently the starting point. I'm not
> sure if the impact is visible or not, but it's worth checking.
Please recall that the iowait boosting algo was different to start
with, though: it jumped to the max right away and then backed off.
That turned out to be overly aggressive in general and led to some
undesired "jittery" behavior, which is why it was changed.
Thus it looks like the platforms on which it still jumps to the max
right away may actually benefit from changing it to more steps. :-)
In turn, the platforms where it takes more than 3 steps for the boost
to get to the max would get a slight performance improvement from this
changes and I'm not sure why that could be bad.
Moreover, it didn't depend on the min originally, just on the max and
just because I wanted the number of backoff steps needed to go back
down to zero to be independent of the platform IIRC. The dependency
on the min is sort of artificial here and leads to arbitrary
differences in behavior between different platforms which isn't
particularly fortunate.
With all of that in mind, I would go right away to making the boost
independent of min and max (the final number of steps to reach the max
is TBD in practice, but 3 looks like a good enough compromise to me).
FWIW, the internal governor in intel_pstate has been changed along
these lines already (the changes is waiting for Linus to pull it ATM).