Re: [PATCH v6 7/9] ACPI: CPPC: add APIs and sysfs interface for perf_limited
From: Sumit Gupta
Date: Sat Jan 24 2026 - 16:04:51 EST
On 22/01/26 17:21, Pierre Gondois wrote:
On 1/20/26 15:56, Sumit Gupta wrote:
Add sysfs interface to read/write the Performance Limited register.
The Performance Limited register indicates to the OS that an
unpredictable event (like thermal throttling) has limited processor
performance. It contains two sticky bits set by the platform:
- Bit 0 (Desired_Excursion): Set when delivered performance is
constrained below desired performance. Not used when Autonomous
Selection is enabled.
- Bit 1 (Minimum_Excursion): Set when delivered performance is
constrained below minimum performance.
These bits remain set until OSPM explicitly clears them. The write
operation accepts a bitmask of bits to clear:
- Write 0x1 to clear bit 0
- Write 0x2 to clear bit 1
- Write 0x3 to clear both bits
This enables users to detect if platform throttling impacted a workload.
Users clear the register before execution, run the workload, then check
afterward - if set, hardware throttling occurred during that time window.
The interface is exposed as:
/sys/devices/system/cpu/cpuX/cpufreq/perf_limited
Signed-off-by: Sumit Gupta <sumitg@xxxxxxxxxx>
---
drivers/acpi/cppc_acpi.c | 56 ++++++++++++++++++++++++++++++++++
drivers/cpufreq/cppc_cpufreq.c | 5 +++
include/acpi/cppc_acpi.h | 15 +++++++++
3 files changed, 76 insertions(+)
diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
index 46bf45f8b0f3..b46f22f58f56 100644
--- a/drivers/acpi/cppc_acpi.c
+++ b/drivers/acpi/cppc_acpi.c
@@ -1787,6 +1787,62 @@ int cppc_set_max_perf(int cpu, u32 max_perf)
}
EXPORT_SYMBOL_GPL(cppc_set_max_perf);
+/**
+ * cppc_get_perf_limited - Get the Performance Limited register value.
+ * @cpu: CPU from which to get Performance Limited register.
+ * @perf_limited: Pointer to store the Performance Limited value.
+ *
+ * The returned value contains sticky status bits indicating platform-imposed
+ * performance limitations.
+ *
+ * Return: 0 for success, -EIO on failure, -EOPNOTSUPP if not supported.
+ */
+int cppc_get_perf_limited(int cpu, u64 *perf_limited)
+{
+ return cppc_get_reg_val(cpu, PERF_LIMITED, perf_limited);
+}
+EXPORT_SYMBOL_GPL(cppc_get_perf_limited);
+
+/**
+ * cppc_set_perf_limited() - Clear bits in the Performance Limited register.
+ * @cpu: CPU on which to write register.
+ * @bits_to_clear: Bitmask of bits to clear in the perf_limited register.
+ *
+ * The Performance Limited register contains two sticky bits set by platform:
+ * - Bit 0 (Desired_Excursion): Set when delivered performance is constrained
+ * below desired performance. Not used when Autonomous Selection is enabled.
+ * - Bit 1 (Minimum_Excursion): Set when delivered performance is constrained
+ * below minimum performance.
+ *
+ * These bits are sticky and remain set until OSPM explicitly clears them.
+ * This function only allows clearing bits (the platform sets them).
+ *
+ * Return: 0 for success, -EINVAL for invalid bits, -EIO on register
+ * access failure, -EOPNOTSUPP if not supported.
+ */
+int cppc_set_perf_limited(int cpu, u64 bits_to_clear)
+{
+ u64 current_val, new_val;
+ int ret;
+
+ /* Only bits 0 and 1 are valid */
+ if (bits_to_clear & ~CPPC_PERF_LIMITED_MASK)
+ return -EINVAL;
+
+ if (!bits_to_clear)
+ return 0;
+
+ ret = cppc_get_perf_limited(cpu, &current_val);
+ if (ret)
+ return ret;
+
+ /* Clear the specified bits */
+ new_val = current_val & ~bits_to_clear;
+
+ return cppc_set_reg_val(cpu, PERF_LIMITED, new_val);
+}
+EXPORT_SYMBOL_GPL(cppc_set_perf_limited);
+
/**
* cppc_set_enable - Set to enable CPPC on the processor by writing the
* Continuous Performance Control package EnableRegister field.
diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index 66e183b45fb0..afb2cdb67a2f 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -1071,12 +1071,16 @@ static ssize_t store_max_perf(struct cpufreq_policy *policy, const char *buf,
return count;
}
+CPPC_CPUFREQ_ATTR_RW_U64(perf_limited, cppc_get_perf_limited,
+ cppc_set_perf_limited)
+
cpufreq_freq_attr_ro(freqdomain_cpus);
cpufreq_freq_attr_rw(auto_select);
cpufreq_freq_attr_rw(auto_act_window);
cpufreq_freq_attr_rw(energy_performance_preference_val);
cpufreq_freq_attr_rw(min_perf);
cpufreq_freq_attr_rw(max_perf);
+cpufreq_freq_attr_rw(perf_limited);
If the OS wants regular feedback about whether the platform had to limit
the perf. level, it will likely probe the register frequently.
To see new events, the register must be cleared first. So:
- is it a good idea to allow users to write this register?
- is it useful to expose this register if the OS frequently clears it?
I think the functions are useful; it might just be questionable to expose
the register in sysfs.
Currently the kernel doesn't automatically poll or clear perf_limited,
so the sysfs exposure is for manual monitoring. I could make it read-only,
but then users could only observe throttling events and never clear
them, and because the bits are sticky, a bit that is already set would
mask any later event. So it is better to expose it as an RW attribute.
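For illustration only (not part of this patch), the manual flow from the
commit message could be sketched in userspace roughly as below. The helper
names perf_limited_read() and perf_limited_clear() are hypothetical, and the
sketch assumes the attribute reads back as a single integer and accepts the
clear mask written as a decimal number.

```c
#include <errno.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Bit layout as described in the commit message. */
#define PERF_LIMITED_DESIRED_EXCURSION	(1ULL << 0)
#define PERF_LIMITED_MINIMUM_EXCURSION	(1ULL << 1)
#define PERF_LIMITED_MASK \
	(PERF_LIMITED_DESIRED_EXCURSION | PERF_LIMITED_MINIMUM_EXCURSION)

/*
 * Hypothetical helper: read the attribute at @path into @val.
 * Base 0 is used so that decimal and 0x-prefixed output both parse
 * (the exact output format is an assumption).
 * Returns 0 on success or a negative errno value.
 */
static int perf_limited_read(const char *path, uint64_t *val)
{
	char buf[32];
	char *end;
	FILE *f = fopen(path, "r");

	if (!f)
		return -errno;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -EIO;
	}
	fclose(f);

	errno = 0;
	*val = strtoull(buf, &end, 0);
	if (errno || end == buf)
		return -EINVAL;
	return 0;
}

/*
 * Hypothetical helper: write a mask of bits to clear to @path.
 * Masks outside bits 0-1 are rejected locally, mirroring the check
 * in cppc_set_perf_limited().
 * Returns 0 on success or a negative errno value.
 */
static int perf_limited_clear(const char *path, uint64_t bits)
{
	FILE *f;
	int ret = 0;

	if (bits & ~PERF_LIMITED_MASK)
		return -EINVAL;

	f = fopen(path, "w");
	if (!f)
		return -errno;
	if (fprintf(f, "%" PRIu64 "\n", bits) < 0)
		ret = -EIO;
	if (fclose(f) != 0 && !ret)
		ret = -errno;
	return ret;
}
```

A tool would call perf_limited_clear(path, 0x3) before the workload and
perf_limited_read(path, &val) afterwards, with path pointing at the cpuX
perf_limited attribute. A non-zero val then means the platform limited
performance at some point in that window.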
Thank you,
Sumit Gupta
static struct freq_attr *cppc_cpufreq_attr[] = {
&freqdomain_cpus,
@@ -1085,6 +1089,7 @@ static struct freq_attr *cppc_cpufreq_attr[] = {
&energy_performance_preference_val,
&min_perf,
&max_perf,
+ &perf_limited,
NULL,
};
....