[PATCH 4.9 01/60] cpufreq: intel_pstate: Disable energy efficiency optimization

From: Greg Kroah-Hartman
Date: Mon Feb 13 2017 - 08:04:43 EST

4.9-stable review patch. If anyone has any objections, please let me know.


From: Srinivas Pandruvada <srinivas.pandruvada@xxxxxxxxxxxxxxx>

commit 6e978b22efa1db9f6e71b24440b5f1d93e968ee3 upstream.

Some Kabylake desktop processors may not reach max turbo when running in
HWP mode, even if running under sustained 100% utilization.

This occurs when the HWP.EPP (Energy Performance Preference) is set to
"balance_power" (0x80) -- the default on most systems.

It occurs because the platform BIOS may erroneously enable an
energy-efficiency setting -- MSR_IA32_POWER_CTL BIT-EE, which is not
recommended to be enabled on this SKU.

On the failing systems, this BIOS issue was not discovered when the
desktop motherboard was tested with Windows, because the BIOS also
neglects to provide the ACPI/CPPC table, that Windows requires to enable
HWP, and so Windows runs in legacy P-state mode, where this setting has
no effect.

Linux' intel_pstate driver does not require ACPI/CPPC to enable HWP, and
so it runs in HWP mode, exposing this incorrect BIOS configuration.

There are several ways to address this problem.

First, Linux can also run in legacy P-state mode on this system.
As intel_pstate is how Linux enables HWP, booting with
will run in acpi-cpufreq/ondemand legacy p-state mode.

Or second, the "performance" governor can be used with intel_pstate,
which will modify HWP.EPP to 0.

Or third, starting in 4.10, the
attribute in can be updated from "balance_power" to "performance".

Or fourth, apply this patch, which fixes the erroneous setting of
MSR_IA32_POWER_CTL BIT_EE on this model, allowing the default
configuration to function as designed.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@xxxxxxxxxxxxxxx>
Reviewed-by: Len Brown <len.brown@xxxxxxxxx>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>

drivers/cpufreq/intel_pstate.c | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)

--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -820,6 +820,25 @@ static void intel_pstate_hwp_enable(stru
wrmsrl_on_cpu(cpudata->cpu, MSR_PM_ENABLE, 0x1);

+#define MSR_IA32_POWER_CTL_BIT_EE 19
+/* Disable energy efficiency optimization */
+static void intel_pstate_disable_ee(int cpu)
+ u64 power_ctl;
+ int ret;
+ ret = rdmsrl_on_cpu(cpu, MSR_IA32_POWER_CTL, &power_ctl);
+ if (ret)
+ return;
+ if (!(power_ctl & BIT(MSR_IA32_POWER_CTL_BIT_EE))) {
+ pr_info("Disabling energy efficiency optimization\n");
+ power_ctl |= BIT(MSR_IA32_POWER_CTL_BIT_EE);
+ wrmsrl_on_cpu(cpu, MSR_IA32_POWER_CTL, power_ctl);
+ }
static int atom_get_min_pstate(void)
u64 value;
@@ -1420,6 +1439,11 @@ static const struct x86_cpu_id intel_pst

+static const struct x86_cpu_id intel_pstate_cpu_ee_disable_ids[] = {
+ {}
static int intel_pstate_init_cpu(unsigned int cpunum)
struct cpudata *cpu;
@@ -1435,6 +1459,12 @@ static int intel_pstate_init_cpu(unsigne
cpu->cpu = cpunum;

if (hwp_active) {
+ const struct x86_cpu_id *id;
+ id = x86_match_cpu(intel_pstate_cpu_ee_disable_ids);
+ if (id)
+ intel_pstate_disable_ee(cpunum);
pid_params.sample_rate_ms = 50;
pid_params.sample_rate_ns = 50 * NSEC_PER_MSEC;