Re: [PATCH] powercap: intel_rapl: Fix setting of Power Limit 4 to 0

From: Zhang, Rui
Date: Tue Sep 05 2023 - 22:20:43 EST


Hi, Srinivas,

Thanks for the patch.

On Tue, 2023-09-05 at 17:06 -0700, Srinivas Pandruvada wrote:
> System runs at minimum performance, once powercap RAPL package domain
> "enabled" flag is toggled.
>
> Setting RAPL package domain enabled flag to 0, results in setting of
> power limit 4 (PL4) MSR 0x601 to 0.

This is the bug. Setting enabled flag should only affect the Enable bit
but not the other bits.

> This implies disabling PL4 limit.
> The PL4 limit controls the peak power. This can significantly change
> the performance. Even worse, when the enabled flag is set to 1 again.
> This will set PL4 MSR value to 0x01, which means reduce peak power to
> 0.125W. This will force the system to run at the lowest possible
> performance.
>
> This is caused by a change which assumes that there is an enable bit
> in the PL4 MSR like other power limits.

MSR RAPL doesn't have PL4 enable bit, but TPMI RAPL does have per power
limit enable bit.

>
> In functions set_floor_freq_default() and rapl_remove_package(), call
> rapl_write_pl_data with PL_ENABLE and PL_CLAMP for only power limit 1
> and 2. Similarly don't read PL_ENABLE for PL4 to check the presence
> of
> power limit 4.

IMO, the rootcause is that we expose an non-exits PL4_ENABLE primitive
for MSR Interface, and even worse we expose it wrongly that the
primitive uses the power limit bits.

Commit 9050a9cd5e4c ("powercap: intel_rapl: Cleanup Power Limits
support") pokes the MSR interface PL4_ENABLE primitive and exposes this
bug.

Sumeet is testing the below fix right now, and I suppose he will give
some update soon.

thanks,
rui

diff --git a/drivers/powercap/intel_rapl_common.c b/drivers/powercap/intel_rapl_common.c
index 8fac57b28f8a..d407f918876f 100644
--- a/drivers/powercap/intel_rapl_common.c
+++ b/drivers/powercap/intel_rapl_common.c
@@ -658,8 +658,6 @@ static struct rapl_primitive_info rpi_msr[NR_RAPL_PRIMITIVES] = {
RAPL_DOMAIN_REG_LIMIT, ARBITRARY_UNIT, 0),
[PL2_CLAMP] = PRIMITIVE_INFO_INIT(PL2_CLAMP, POWER_LIMIT2_CLAMP, 48,
RAPL_DOMAIN_REG_LIMIT, ARBITRARY_UNIT, 0),
- [PL4_ENABLE] = PRIMITIVE_INFO_INIT(PL4_ENABLE, POWER_LIMIT4_MASK, 0,
- RAPL_DOMAIN_REG_PL4, ARBITRARY_UNIT, 0),
[TIME_WINDOW1] = PRIMITIVE_INFO_INIT(TIME_WINDOW1, TIME_WINDOW1_MASK, 17,
RAPL_DOMAIN_REG_LIMIT, TIME_UNIT, 0),
[TIME_WINDOW2] = PRIMITIVE_INFO_INIT(TIME_WINDOW2, TIME_WINDOW2_MASK, 49,