Re: [PATCH 2/2][RFC v2] ACPI: Update cpuinfo.max after bootup if necessary
From: Rafael J. Wysocki
Date: Sun Mar 03 2019 - 12:04:45 EST
On Sat, Mar 2, 2019 at 10:55 AM Yu Chen <yu.c.chen@xxxxxxxxx> wrote:
>
> On Thu, Feb 28, 2019 at 11:56:48PM +0100, Rafael J. Wysocki wrote:
> > On Thu, Feb 28, 2019 at 6:59 PM Chen Yu <yu.c.chen@xxxxxxxxx> wrote:
> > >
> > > On Dell Inc. XPS13 9333, the BIOS changes the value of
> > > MSR_IA32_MISC_ENABLE_TURBO_DISABLE at runtime (e.g., when
> > > the power source changes), the maximum frequency of the
> > > CPU is not updated accordingly. This is due to the policy's
> > > cpuinfo.max is not updated when _PPC notifier fires.
> > >
> > > Fix this problem by updating the policy's cpuinfo.max when
> > > necessary.
> > >
> > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=200759
> > > Reported-and-tested-by: Gabriele Mazzotta <gabriele.mzt@xxxxxxxxx>
> > > Originally-by: Srinivas Pandruvada <srinivas.pandruvada@xxxxxxxxxxxxxxx>
> > > Signed-off-by: Chen Yu <yu.c.chen@xxxxxxxxx>
> > > ---
> > > drivers/cpufreq/cpufreq.c | 2 ++
> > > drivers/cpufreq/intel_pstate.c | 15 +++++++++++++--
> > > 2 files changed, 15 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> > > index e35a886e00bc..95e08816b512 100644
> > > --- a/drivers/cpufreq/cpufreq.c
> > > +++ b/drivers/cpufreq/cpufreq.c
> > > @@ -2237,6 +2237,8 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
> > >
> > > policy->min = new_policy->min;
> > > policy->max = new_policy->max;
> > > + policy->cpuinfo.max_freq = new_policy->cpuinfo.max_freq;
> > > + policy->cpuinfo.min_freq = new_policy->cpuinfo.min_freq;
> > > trace_cpu_frequency_limits(policy);
> > >
> > > policy->cached_target_freq = UINT_MAX;
> > > diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
> > > index dd66decf2087..841c74f69f66 100644
> > > --- a/drivers/cpufreq/intel_pstate.c
> > > +++ b/drivers/cpufreq/intel_pstate.c
> > > @@ -2081,11 +2081,17 @@ static void intel_pstate_adjust_policy_max(struct cpufreq_policy *policy,
> > >
> > > static int intel_pstate_verify_policy(struct cpufreq_policy *policy)
> > > {
> > > + int max_freq;
> > > struct cpudata *cpu = all_cpu_data[policy->cpu];
> > >
> > > update_turbo_state();
> > > + max_freq = intel_pstate_get_max_freq(cpu);
> > > +
> > > + if (acpi_ppc && max_freq != policy->cpuinfo.max_freq)
> > > + policy->cpuinfo.max_freq = policy->max = max_freq;
> >
> > Updating cpuinfo.max_freq here only causes the current limit to be
> > reported via sysfs, because intel_pstate doesn't actually use
> > cpuinfo.max_freq for anything and the core doesn't enforce it as a
> > limit.
> >
> > All of the computations in the active mode in the driver actually use
> > the current limit anyway AFAICS.
> >
> Yes, but it looks like we should also take care of the cpuinfo.max
> if it changes, I searched the code, it seems that other components
> might use the policy's cpuinfo.max for some purpose. They might use
> cpufreq_cpu_get() to get the policy, and use the cpuinfo.max_freq
> directly, no matter what the mode intel_pstate is in. Such as kvm
> might use it as the max tsc khz if the tsc is not constant. And i915
> might consider the cpuinfo.max_freq to adjust the IA frequency on
> gen6 platforms. So changing cpuinfo.max might also impact not only
> cpufreq but also other components too.
> > > +
> > > cpufreq_verify_within_limits(policy, policy->cpuinfo.min_freq,
> > > - intel_pstate_get_max_freq(cpu));
> > > + max_freq);
> > >
> > > if (policy->policy != CPUFREQ_POLICY_POWERSAVE &&
> > > policy->policy != CPUFREQ_POLICY_PERFORMANCE)
> > > @@ -2192,11 +2198,16 @@ static struct cpufreq_driver intel_pstate = {
> > >
> > > static int intel_cpufreq_verify_policy(struct cpufreq_policy *policy)
> > > {
> > > + int max_freq;
> > > struct cpudata *cpu = all_cpu_data[policy->cpu];
> > >
> > > update_turbo_state();
> > > + max_freq = intel_pstate_get_max_freq(cpu);
> > > +
> > > + if (acpi_ppc && max_freq != policy->cpuinfo.max_freq)
> > > + policy->cpuinfo.max_freq = policy->max = max_freq;
> >
> > In this case (passive mode) updating cpuinfo.max_freq will actually
> > cause governors to use the new value in computations, so the P-state
> > selection will work somewhat differently, but that isn't really
> > consistent with what acpi-cpufreq does and with setting no_turbo in
> > the intel_pstate sysfs to 1 without this patch.
> >
> > With acpi-cpufreq cpuinfo.max_freq is always the max frequency in the
> > _PSS table regardless of what the _PSS limit is. Also setting
> > no_turbo to 1 in intel_pstate without this patch doesn't cause
> > cpuinfo.max_freq to change and I don't really think that it should.
> >
> > I would be tempted to always initialize cpuinfo.max_freq to the max
> > turbo frequency, but there is a concern about systems in which
> > MSR_IA32_MISC_ENABLE_TURBO_DISABLE is never set on the fly (just in
> > the BIOS setup as it should be) and where it doesn't make sense to
> > consider turbo frequencies at all.
> Ok, maybe we can check the bit during boot(consider BIOS's setting)?
> >
> > > cpufreq_verify_within_limits(policy, policy->cpuinfo.min_freq,
> > > - intel_pstate_get_max_freq(cpu));
> > > + max_freq);
> > >
> > > intel_pstate_adjust_policy_max(policy, cpu);
> > >
> > > --
> >
> > It looks like I need to think about this a bit more.
> Ok, I'll test the patch you sent out.
Please do.