[PATCH] powercap: intel_rapl: Fix setting of Power Limit 4 to 0

From: Srinivas Pandruvada
Date: Tue Sep 05 2023 - 20:07:07 EST


System runs at minimum performance, once powercap RAPL package domain
"enabled" flag is toggled.

Setting RAPL package domain enabled flag to 0, results in setting of
power limit 4 (PL4) MSR 0x601 to 0. This implies disabling PL4 limit.
The PL4 limit controls the peak power. This can significantly change
the performance. Even worse, when the enabled flag is set to 1 again.
This will set PL4 MSR value to 0x01, which means reduce peak power to
0.125W. This will force the system to run at the lowest possible
performance.

This is caused by a change which assumes that there is an enable bit
in the PL4 MSR like other power limits.

In functions set_floor_freq_default() and rapl_remove_package(), call
rapl_write_pl_data with PL_ENABLE and PL_CLAMP for only power limit 1
and 2. Similarly don't read PL_ENABLE for PL4 to check the presence of
power limit 4. Power limit 4 support is based on CPU model in this
driver. No additional checks can be done.

Fixes: 9050a9cd5e4c ("powercap: intel_rapl: Cleanup Power Limits support")
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@xxxxxxxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx # v6.5+
---
drivers/powercap/intel_rapl_common.c | 14 ++++----------
1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/drivers/powercap/intel_rapl_common.c b/drivers/powercap/intel_rapl_common.c
index 5c2e6d5eea2a..0afedf7ad872 100644
--- a/drivers/powercap/intel_rapl_common.c
+++ b/drivers/powercap/intel_rapl_common.c
@@ -184,8 +184,6 @@ static int get_pl_prim(struct rapl_domain *rd, int pl, enum pl_prims prim)
case POWER_LIMIT4:
if (prim == PL_LIMIT)
return POWER_LIMIT4;
- if (prim == PL_ENABLE)
- return PL4_ENABLE;
/* PL4 would be around two times PL2, use same prim as PL2. */
if (prim == PL_MAX_POWER)
return MAX_POWER;
@@ -1033,17 +1031,13 @@ static void package_power_limit_irq_restore(struct rapl_package *rp)

static void set_floor_freq_default(struct rapl_domain *rd, bool mode)
{
- int i;
-
/* always enable clamp such that p-state can go below OS requested
* range. power capping priority over guranteed frequency.
*/
rapl_write_pl_data(rd, POWER_LIMIT1, PL_CLAMP, mode);

- for (i = POWER_LIMIT2; i < NR_POWER_LIMITS; i++) {
- rapl_write_pl_data(rd, i, PL_ENABLE, mode);
- rapl_write_pl_data(rd, i, PL_CLAMP, mode);
- }
+ rapl_write_pl_data(rd, POWER_LIMIT2, PL_ENABLE, mode);
+ rapl_write_pl_data(rd, POWER_LIMIT2, PL_CLAMP, mode);
}

static void set_floor_freq_atom(struct rapl_domain *rd, bool enable)
@@ -1458,7 +1452,7 @@ static void rapl_detect_powerlimit(struct rapl_domain *rd)
}
}

- if (rapl_read_pl_data(rd, i, PL_ENABLE, false, &val64))
+ if (i != POWER_LIMIT4 && rapl_read_pl_data(rd, i, PL_ENABLE, false, &val64))
rd->rpl[i].name = NULL;
}
}
@@ -1510,7 +1504,7 @@ void rapl_remove_package(struct rapl_package *rp)
for (rd = rp->domains; rd < rp->domains + rp->nr_domains; rd++) {
int i;

- for (i = POWER_LIMIT1; i < NR_POWER_LIMITS; i++) {
+ for (i = POWER_LIMIT1; i <= POWER_LIMIT2; i++) {
rapl_write_pl_data(rd, i, PL_ENABLE, 0);
rapl_write_pl_data(rd, i, PL_CLAMP, 0);
}
--
2.34.1