Re: [RFC/RFT] [PATCH 02/10] cpufreq: intel_pstate: Conditional frequency invariant accounting

From: Srinivas Pandruvada
Date: Wed May 16 2018 - 11:26:08 EST


On Wed, 2018-05-16 at 17:19 +0200, Juri Lelli wrote:
> On 15/05/18 21:49, Srinivas Pandruvada wrote:
> > intel_pstate has two operating modes: active and passive. In "active"
> > mode, the in-built scaling governor is used, and in "passive" mode
> > the driver can be used with any governor, such as "schedutil". In
> > "active" mode the utilization values from schedutil are not used, and
> > high performance computing use cases require that the APERF/MPERF
> > MSRs not be read. In this case there is no need to spend CPU cycles
> > on frequency invariant accounting by reading the APERF/MPERF MSRs.
> > With this change, frequency invariant accounting is only enabled in
> > "passive" mode.
> >
> > Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@xxxxxxxxxxx.com>
> > ---
> > [Note: The tick will be enabled later in the series when hwp dynamic
> > boost is enabled]
> >
> > drivers/cpufreq/intel_pstate.c | 5 +++++
> > 1 file changed, 5 insertions(+)
> >
> > diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
> > index 17e566af..f686bbe 100644
> > --- a/drivers/cpufreq/intel_pstate.c
> > +++ b/drivers/cpufreq/intel_pstate.c
> > @@ -2040,6 +2040,8 @@ static int intel_pstate_register_driver(struct cpufreq_driver *driver)
> >  {
> >  	int ret;
> >
> > +	x86_arch_scale_freq_tick_disable();
> > +
> >  	memset(&global, 0, sizeof(global));
> >  	global.max_perf_pct = 100;
> >
> > @@ -2052,6 +2054,9 @@ static int intel_pstate_register_driver(struct cpufreq_driver *driver)
> >
> >  	global.min_perf_pct = min_perf_pct_min();
> >
> > +	if (driver == &intel_cpufreq)
> > +		x86_arch_scale_freq_tick_enable();
>
> This will unconditionally trigger the reading/calculation at each tick
> even though the information is not actually consumed (e.g., running
> performance or any other governor), right? Do we want that?

Good point. I should call x86_arch_scale_freq_tick_disable() when
switching to performance mode in active mode.
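
As a rough illustration only (a standalone userspace mock, not the actual
driver change): the sketch below enables the accounting only when the
driver is in passive mode and the governor is schedutil, and disables it
otherwise. The two x86_arch_scale_freq_tick_*() functions are stubbed here
so the example compiles on its own; enum pstate_mode, freq_tick_enabled
and update_freq_invariance() are hypothetical names made up for this
sketch, and the "only schedutil consumes it" rule is an assumption.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Userspace stand-ins for the hooks used in the patch. They only record
 * the requested state so the decision logic can be exercised.
 */
static bool freq_tick_enabled;

static void x86_arch_scale_freq_tick_enable(void)
{
	freq_tick_enabled = true;
}

static void x86_arch_scale_freq_tick_disable(void)
{
	freq_tick_enabled = false;
}

/* Hypothetical: the two intel_pstate operating modes. */
enum pstate_mode { PSTATE_ACTIVE, PSTATE_PASSIVE };

/*
 * Hypothetical helper: pay for the APERF/MPERF reads at each tick only
 * when something consumes the result, which this sketch assumes means
 * passive mode with the schedutil governor.
 */
static void update_freq_invariance(enum pstate_mode mode, const char *governor)
{
	if (mode == PSTATE_PASSIVE && governor &&
	    strcmp(governor, "schedutil") == 0)
		x86_arch_scale_freq_tick_enable();
	else
		x86_arch_scale_freq_tick_disable();
}
```

In the real driver such a check would have to run on every mode or
governor change, not only at driver registration.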

Thanks,
Srinivas

>
> Anyway, FWIW I started testing this on an E5-2609 v3 and I'm not seeing
> hackbench regressions so far (running with the schedutil governor).
>
> Best,
>
> - Juri