[PATCH 1/2] cpufreq: intel_pstate: Avoid missing HWP max updates in passive mode

From: Rafael J. Wysocki
Date: Thu Oct 22 2020 - 07:57:59 EST


From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>

If the cpufreq policy max limit is changed while intel_pstate operates
in the passive mode with HWP enabled and the "powersave" governor is
used on top of it, the HWP max limit is not updated to match.

Namely, in the "powersave" governor case, the target P-state
is always equal to the policy min limit, so if the latter does
not change, the "target_freq == policy->cur" check in
__cpufreq_driver_target() is "true" and the
"target_pstate != old_pstate" check in intel_cpufreq_update_pstate()
is "false", so intel_cpufreq_adjust_hwp() is not invoked to update
the HWP Request MSR and the HWP max limit is not updated as a result.

To prevent that from occurring, modify __cpufreq_driver_target()
to do the "target_freq == policy->cur" check only in the frequency
table case, and change intel_cpufreq_update_pstate() to do the
"target_pstate != old_pstate" check only in the non-HWP case, so that
intel_cpufreq_adjust_hwp() always runs in the HWP case. That function
updates HWP Request only if the cached value of the register differs
from the new one, limits included, so if neither the target P-state
value nor the max limit changes, the register write is still avoided.
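
For illustration, the behavior described above can be modeled with a
small stand-alone C sketch. It is not kernel code: struct model_cpu,
encode_hwp_req(), hwp_req_max(), the model_*() helpers, run_scenario()
and the simplified register layout are all assumptions made for the
example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of per-CPU state; not the kernel's struct cpudata. */
struct model_cpu {
	int current_pstate;      /* last target P-state recorded by the driver */
	uint64_t hwp_req_cached; /* cached copy of the HWP Request value */
	int msr_writes;          /* number of simulated register writes */
};

/* Simplified layout assumed here: min in bits 7:0, max in bits 15:8. */
static uint64_t encode_hwp_req(int min_perf, int max_perf)
{
	return ((uint64_t)max_perf << 8) | (uint64_t)min_perf;
}

static int hwp_req_max(uint64_t value)
{
	return (int)((value >> 8) & 0xff);
}

/* Write the register only if the new value differs from the cached one. */
static void model_adjust_hwp(struct model_cpu *cpu, int target_pstate,
			     int max_limit)
{
	uint64_t value = encode_hwp_req(target_pstate, max_limit);

	if (value == cpu->hwp_req_cached)
		return;

	cpu->hwp_req_cached = value;
	cpu->msr_writes++;
}

/* Pre-patch flow: nothing is done unless the target P-state changes. */
static void model_update_old(struct model_cpu *cpu, int target_pstate,
			     int max_limit)
{
	if (target_pstate != cpu->current_pstate) {
		cpu->current_pstate = target_pstate;
		model_adjust_hwp(cpu, target_pstate, max_limit);
	}
}

/* Post-patch flow: the HWP helper always runs and filters redundant writes. */
static void model_update_new(struct model_cpu *cpu, int target_pstate,
			     int max_limit)
{
	model_adjust_hwp(cpu, target_pstate, max_limit);
	cpu->current_pstate = target_pstate;
}

/*
 * "powersave" keeps the target at the policy min (8 here) while the
 * policy max drops from 40 to 20; the update is requested twice.
 */
static struct model_cpu run_scenario(bool patched)
{
	struct model_cpu cpu = {
		.current_pstate = 8,
		.hwp_req_cached = encode_hwp_req(8, 40),
		.msr_writes = 0,
	};
	int i;

	for (i = 0; i < 2; i++) {
		if (patched)
			model_update_new(&cpu, 8, 20);
		else
			model_update_old(&cpu, 8, 20);
	}

	return cpu;
}
```

With these assumed numbers, run_scenario(false) leaves the cached max
at 40 with no register write, whereas run_scenario(true) ends with a
max of 20 after exactly one write, because the second request is
filtered out by the cached-value comparison.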

Fixes: f6ebbcf08f37 ("cpufreq: intel_pstate: Implement passive mode with HWP enabled")
Reported-by: Zhang Rui <rui.zhang@xxxxxxxxx>
Cc: 5.9+ <stable@xxxxxxxxxxxxxxx> # 5.9+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
---
 drivers/cpufreq/cpufreq.c      |    6 +++---
 drivers/cpufreq/intel_pstate.c |   12 +++++-------
 2 files changed, 8 insertions(+), 10 deletions(-)

Index: linux-pm/drivers/cpufreq/intel_pstate.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/intel_pstate.c
+++ linux-pm/drivers/cpufreq/intel_pstate.c
@@ -2550,14 +2550,12 @@ static int intel_cpufreq_update_pstate(s
 	int old_pstate = cpu->pstate.current_pstate;
 
 	target_pstate = intel_pstate_prepare_request(cpu, target_pstate);
-	if (target_pstate != old_pstate) {
+	if (hwp_active) {
+		intel_cpufreq_adjust_hwp(cpu, target_pstate, fast_switch);
+		cpu->pstate.current_pstate = target_pstate;
+	} else if (target_pstate != old_pstate) {
+		intel_cpufreq_adjust_perf_ctl(cpu, target_pstate, fast_switch);
 		cpu->pstate.current_pstate = target_pstate;
-		if (hwp_active)
-			intel_cpufreq_adjust_hwp(cpu, target_pstate,
-						 fast_switch);
-		else
-			intel_cpufreq_adjust_perf_ctl(cpu, target_pstate,
-						      fast_switch);
 	}
 
 	intel_cpufreq_trace(cpu, fast_switch ? INTEL_PSTATE_TRACE_FAST_SWITCH :
Index: linux-pm/drivers/cpufreq/cpufreq.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq.c
+++ linux-pm/drivers/cpufreq/cpufreq.c
@@ -2182,6 +2182,9 @@ int __cpufreq_driver_target(struct cpufr
 	pr_debug("target for CPU %u: %u kHz, relation %u, requested %u kHz\n",
 		 policy->cpu, target_freq, relation, old_target_freq);
 
+	if (cpufreq_driver->target)
+		return cpufreq_driver->target(policy, target_freq, relation);
+
 	/*
 	 * This might look like a redundant call as we are checking it again
 	 * after finding index. But it is left intentionally for cases where
@@ -2194,9 +2197,6 @@ int __cpufreq_driver_target(struct cpufr
 	/* Save last value to restore later on errors */
 	policy->restore_freq = policy->cur;
 
-	if (cpufreq_driver->target)
-		return cpufreq_driver->target(policy, target_freq, relation);
-
 	if (!cpufreq_driver->target_index)
 		return -EINVAL;