Re: [PATCH v2 2/2] cpufreq: intel_pstate: Conditional frequency invariant accounting

From: Rafael J. Wysocki
Date: Fri Oct 04 2019 - 04:09:06 EST


On Fri, Oct 4, 2019 at 5:31 AM Srinivas Pandruvada
<srinivas.pandruvada@xxxxxxxxxxxxxxx> wrote:
>
> On Thu, 2019-10-03 at 20:05 +0200, Rafael J. Wysocki wrote:
> > On Wednesday, October 2, 2019 2:29:26 PM CEST Giovanni Gherdovich wrote:
> > > From: Srinivas Pandruvada <srinivas.pandruvada@xxxxxxxxxxxxxxx>
> > >
> > > intel_pstate has two operating modes: active and passive. In "active"
> > > mode, the in-built scaling governor is used and in "passive" mode,
> > > the driver can be used with any governor like "schedutil". In "active"
> > > mode the utilization values from schedutil are not used and there is
> > > a requirement from high performance computing use cases not to read
> > > any APERF/MPERF MSRs.
> >
> > Well, this isn't quite convincing.
> >
> > In particular, I don't see why the "don't read APERF/MPERF MSRs" argument
> > applies *only* to intel_pstate in the "active" mode. What about intel_pstate
> > in the "passive" mode combined with the "performance" governor? Or any other
> > governor different from "schedutil" for that matter?
> >
> > And what about acpi_cpufreq combined with any governor different from
> > "schedutil"?
> >
> > Scale invariance is not really needed in all of those cases right now AFAICS,
> > or is it?
>
> Correct. This is just part of the patch to disable it in active mode
> (particularly in HWP and performance mode).
>
> But this patch is 2 years old. The folks who wanted this disable
> intel_pstate and use the userspace governor with acpi-cpufreq. So it may
> be better to address those cases too.

Well, that's my point. :-)

It looks like the scale invariance is only needed when the schedutil
governor is used, regardless of the driver, and it may lead to
performance degradation in the other cases, at least in principle (I
wonder, though, if any hard data supporting that claim are available).
That can be addressed in two ways IMO: either by reducing the possible
negative impact of the scale invariance code (e.g. by running it less
frequently), so that it can always be enabled (as long as it is
supported by the processor), or by not running it in any of the cases
in which it is not needed (but that basically would require the governor
->init and ->exit to enable and disable the scale invariance,
respectively).
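
To illustrate the second option, here is a minimal userspace sketch of
governor ->init and ->exit callbacks taking and dropping a reference on
the scale invariance accounting. It is not kernel code: the struct
layout, the scale_inv_*() helpers and the "needs_scale_inv" flag are
all hypothetical names made up for this example, and a real
implementation would have to hook into the actual cpufreq governor
callbacks and the arch-specific APERF/MPERF code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: number of active governors that need scale invariance. */
static int scale_inv_users;

/* Hypothetical stand-ins for switching the APERF/MPERF accounting on/off. */
static void scale_inv_get(void)
{
	scale_inv_users++;
}

static void scale_inv_put(void)
{
	assert(scale_inv_users > 0);	/* unbalanced ->exit would be a bug */
	scale_inv_users--;
}

/* The accounting only runs while at least one user holds a reference. */
static bool scale_inv_enabled(void)
{
	return scale_inv_users > 0;
}

/* Simplified, hypothetical governor descriptor (not struct cpufreq_governor). */
struct governor {
	const char *name;
	bool needs_scale_inv;		/* assumption: true only for schedutil */
	int (*init)(struct governor *gov);
	void (*exit)(struct governor *gov);
};

/* ->init enables scale invariance only for governors that consume it. */
static int gov_init(struct governor *gov)
{
	if (gov->needs_scale_inv)
		scale_inv_get();
	return 0;
}

/* ->exit drops the reference taken in ->init. */
static void gov_exit(struct governor *gov)
{
	if (gov->needs_scale_inv)
		scale_inv_put();
}

static struct governor schedutil_like = {
	.name = "schedutil", .needs_scale_inv = true,
	.init = gov_init, .exit = gov_exit,
};

static struct governor performance_like = {
	.name = "performance", .needs_scale_inv = false,
	.init = gov_init, .exit = gov_exit,
};
```

With this arrangement, starting the "performance"-like governor leaves
the accounting off, so the MSRs would never be read on its behalf, and
the reference count keeps it enabled as long as any schedutil-like
governor is still in use on some policy.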

> >
> > So is the real concern that intel_pstate in the "active" mode reads the MPERF
> > and APERF MSRs by itself and that kind of duplicates what the scale invariance
> > code does and is redundant etc?
> It is redundant in non-HWP mode. In HWP and performance (active mode)
> we don't use it, at least at this time.

Right.