Re: [PATCH 1/1] cpufreq: Set policy->min and max as real QoS constraints
From: Pierre Gondois
Date: Mon May 11 2026 - 09:58:27 EST
Hello Viresh,
On 5/8/26 11:47, Viresh Kumar wrote:
On 23-04-26, 10:47, Pierre Gondois wrote:Ok
cpufreq_set_policy() will ultimately override the policy min/maxI was looking for this series to be divided in 2-3 patches:
values written in the .init() callback through:
cpufreq_policy_online()
\-cpufreq_init_policy()
\-cpufreq_set_policy()
\-/* Set policy->min/max */
Thus the policy min/max values provided are only temporary.
There is an exception if CPUFREQ_NEED_INITIAL_FREQ_CHECK is set and:
cpufreq_policy_online()
\-cpufreq_init_policy()
\-__cpufreq_driver_target()
\-cpufreq_driver->target()
is called. To avoid any regression, set policy->min/max in cpufreq.c
if the values were not initialized.
In this patch:
- Setting policy->min or max value in driver .init() cb is
interpreted as setting a QoS constraint.
- Remove policy->min/max initialization in drivers if the values
are similar to policy->cpuinfo.min_freq/max_freq.
The only drivers where these values are different are:
- gx-suspmod.c
- cppc-cpufreq.c
- longrun.c
- For the cppc-cpufreq driver, the lowest non-linear freq. is
used as a min QoS constraint as suggested at:
https://lore.kernel.org/lkml/20260213100633.15413-1-zhangpengjie2@xxxxxxxxxx/
Signed-off-by: Pierre Gondois <pierre.gondois@xxxxxxx>
---
drivers/cpufreq/amd-pstate.c | 16 ++++++++--------
drivers/cpufreq/cppc_cpufreq.c | 11 +++++++----
drivers/cpufreq/cpufreq-nforce2.c | 4 ++--
drivers/cpufreq/cpufreq.c | 19 +++++++++++++++++--
drivers/cpufreq/freq_table.c | 7 +++----
drivers/cpufreq/gx-suspmod.c | 9 +++++----
drivers/cpufreq/intel_pstate.c | 3 ---
drivers/cpufreq/pcc-cpufreq.c | 8 ++++----
drivers/cpufreq/pxa3xx-cpufreq.c | 4 ++--
drivers/cpufreq/sh-cpufreq.c | 4 ++--
drivers/cpufreq/virtual-cpufreq.c | 5 +----
11 files changed, 51 insertions(+), 39 deletions(-)
- 1st patch would set policy->min/max from cpufreq_policy_online(), if they
aren't already set.
- 2nd patch changes all drivers to not set it anymore, unless QoS thingy.
- Last change talks about QoS and default values.
Ok yes
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.cThis can come in two lines now instead of three. Similar issue at few more
index 453084c67327f..1ed4bcdcc957f 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -1090,10 +1090,10 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
perf = READ_ONCE(cpudata->perf);
- policy->cpuinfo.min_freq = policy->min = perf_to_freq(perf,
- cpudata->nominal_freq,
- perf.lowest_perf);
- policy->cpuinfo.max_freq = policy->max = cpudata->max_freq;
+ policy->cpuinfo.min_freq = perf_to_freq(perf,
+ cpudata->nominal_freq,
+ perf.lowest_perf);
places.
Ok sure
diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.cI think the use of local variables isn't making it any better.
index 7e7f9dfb7a24c..c6fcecdbbab0c 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -645,6 +645,7 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
unsigned int cpu = policy->cpu;
struct cppc_cpudata *cpu_data;
struct cppc_perf_caps *caps;
+ unsigned int min, max;
int ret;
cpu_data = cppc_cpufreq_get_cpu_data(cpu);
@@ -655,13 +656,15 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
caps = &cpu_data->perf_caps;
policy->driver_data = cpu_data;
+ min = cppc_perf_to_khz(caps, caps->lowest_nonlinear_perf);
+ max = cppc_perf_to_khz(caps, policy->boost_enabled ?
+ caps->highest_perf : caps->nominal_perf);
+
/*
* Set min to lowest nonlinear perf to avoid any efficiency penalty (see
* Section 8.4.7.1.1.5 of ACPI 6.1 spec)
*/
- policy->min = cppc_perf_to_khz(caps, caps->lowest_nonlinear_perf);
- policy->max = cppc_perf_to_khz(caps, policy->boost_enabled ?
- caps->highest_perf : caps->nominal_perf);
+ policy->min = min;
/*
* Set cpuinfo.min_freq to Lowest to make the full range of performance
@@ -669,7 +672,7 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
* nonlinear perf
*/
policy->cpuinfo.min_freq = cppc_perf_to_khz(caps, caps->lowest_perf);
- policy->cpuinfo.max_freq = policy->max;
+ policy->cpuinfo.max_freq = max;
The above code can be split in 2 functionalities:
diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index 7e7f9dfb7a24..700abcb3e2e0 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -660,8 +660,6 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
* Section 8.4.7.1.1.5 of ACPI 6.1 spec)
*/
policy->min = cppc_perf_to_khz(caps, caps->lowest_nonlinear_perf);
- policy->max = cppc_perf_to_khz(caps, policy->boost_enabled ?
- caps->highest_perf : caps->nominal_perf);
/*
* Set cpuinfo.min_freq to Lowest to make the full range of performance
@@ -669,7 +667,8 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
* nonlinear perf
*/
policy->cpuinfo.min_freq = cppc_perf_to_khz(caps, caps->lowest_perf);
- policy->cpuinfo.max_freq = policy->max;
+ policy->cpuinfo.max_freq = cppc_perf_to_khz(caps, policy->boost_enabled ?
+ caps->highest_perf : caps->nominal_perf);
policy->transition_delay_us = cppc_cpufreq_get_transition_delay_us(cpu);
policy->shared_type = cpu_data->shared_type;
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.cWhy aren't we doing this right after cpufreq_driver->init() ?
index 44eb1b7e7fc1b..b30bfa3e27daa 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1453,6 +1453,14 @@ static int cpufreq_policy_online(struct cpufreq_policy *policy,
cpumask_and(policy->cpus, policy->cpus, cpu_online_mask);
if (new_policy) {
+ unsigned int min, max;
+
+ /* Use policy->min/max set by the driver as QoS requests. */
+ min = max(FREQ_QOS_MIN_DEFAULT_VALUE, policy->min);
+ if (policy->max)
+ max = min(FREQ_QOS_MAX_DEFAULT_VALUE, policy->max);
+ else
+ max = FREQ_QOS_MAX_DEFAULT_VALUE;
for_each_cpu(j, policy->related_cpus) {
per_cpu(cpufreq_cpu_data, j) = policy;
add_cpu_dev_symlink(policy, j, get_cpu_device(j));
@@ -1469,18 +1477,25 @@ static int cpufreq_policy_online(struct cpufreq_policy *policy,
ret = freq_qos_add_request(&policy->constraints,
&policy->min_freq_req, FREQ_QOS_MIN,
- FREQ_QOS_MIN_DEFAULT_VALUE);
+ min);
if (ret < 0)
goto out_destroy_policy;
ret = freq_qos_add_request(&policy->constraints,
&policy->max_freq_req, FREQ_QOS_MAX,
- FREQ_QOS_MAX_DEFAULT_VALUE);
+ max);
if (ret < 0)
goto out_destroy_policy;
blocking_notifier_call_chain(&cpufreq_policy_notifier_list,
CPUFREQ_CREATE_POLICY, policy);
+
+ /*
+ * If the driver didn't set QoS constraints, policy->min/max still
+ * need to be set as they are used to clamp frequency requests.
+ */
+ policy->min = policy->min ? policy->min : policy->cpuinfo.min_freq;
+ policy->max = policy->max ? policy->max : policy->cpuinfo.max_freq;
A. Adding QoS constraints
B. Fetching the min/max QoS constraint as initialized by the driver.
I think the main constraints are:
cpufreq_table_validate_and_sort()
\-cpufreq_frequency_table_cpuinfo() [1]
\-policy_has_boost_freq() [2]
[1] policy->cpuinfo.max_freq is updated
[2] Boost support is detected
So the QoS logic relying on policy->boost_supported
or policy->cpuinfo.max_freq (i.e. A. and B.) must be added
after cpufreq_table_validate_and_sort().
---
if (!new_policy && cpufreq_driver->online) {
} else {
cpufreq_driver->init()
cpufreq_table_validate_and_sort()
// [3]
}
The else branch is executed if the policy is not new and there
is no .online() callback. To avoid initializing multiple times
the QoS constraints, A. can be done at [3] but only if
new_policy==true.
---
I should have sent a V2 which does everything in a
cpufreq_policy_init_qos(). Please let me know if this
seems reasonable to you.