Re: [PATCH 1/3] cpufreq: Add a callback to update the min_freq_req from drivers

From: Rafael J. Wysocki
Date: Fri Oct 04 2024 - 14:18:06 EST


On Thu, Oct 3, 2024 at 10:44 AM Dhananjay Ugwekar
<Dhananjay.Ugwekar@xxxxxxx> wrote:
>
> Currently, there is no proper way to update the initial lower frequency
> limit from cpufreq drivers.

Why do you want to do it?

> The only way is to add a new min_freq QoS
> request from the driver side, but that leads to the issue explained below.
>
> The QoS infrastructure collates the constraints from multiple
> subsystems and saves them in a plist. The "current value" is defined to
> be the highest value in the plist for min_freq constraint.
>
> The cpufreq core adds a QoS request for min_freq of 0, and the amd-pstate
> driver today adds a QoS request for min_freq of lowest_freq, where
> lowest_freq corresponds to CPPC.lowest_perf.
>
> E.g., taking the amd-pstate driver as an example, suppose lowest_freq is
> 400000 KHz and lowest_non_linear_freq is 1200000 KHz.
>
> At this point, the min_freq QoS plist looks like:
>
> head--> 400000 KHz (registered by amd-pstate) --> 0 KHz (registered by
> cpufreq core)
>
> When a user updates /sys/devices/system/cpu/cpuX/cpufreq/scaling_min_freq,
> it only results in updating the cpufreq-core's node in the plist, where
> say 0 becomes the newly echoed value.
>
> Now, if the user echoes a value 1000000 KHz, to scaling_min_freq, then the
> new list would be
>
> head--> 1000000 KHz (registered by cpufreq core) --> 400000 KHz (registered
> by amd-pstate)
>
> and the new "current value" of the min_freq QoS constraint will be 1000000
> KHz. This is the scenario where it works as expected.
>
> Suppose we change the amd-pstate driver's min_freq QoS constraint
> to lowest_non_linear_freq instead of lowest_freq, then the user will
> never be able to request a value below that, due to the following:
>
> At boot time, the min_freq QoS plist would be
>
> head--> 1200000 KHz (registered by amd-pstate) --> 0 KHz (registered by
> cpufreq core)
>
> When the user echoes a value of 1000000 KHz, to
> /sys/devices/..../scaling_min_freq, then the new list would be
>
> head--> 1200000 KHz (registered by amd-pstate) --> 1000000 KHz (registered
> by cpufreq core)
>
> with the new "current value" of the min_freq QoS remaining 1200000 KHz.

Yes, that's how frequency QoS works.
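
To make the aggregation rule concrete, here is a rough user-space model
of it. This is only a sketch: the kernel keeps the requests in a plist,
not an array, and the struct and function names below are made up for
illustration and are not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative model of one min_freq request (not a kernel structure). */
struct min_freq_vote {
	const char *owner;	/* who registered the request */
	int freq_khz;		/* requested lower limit */
};

/* The effective min_freq is the highest of all registered votes. */
static int effective_min_freq(const struct min_freq_vote *votes, size_t n)
{
	int max = 0;
	size_t i;

	assert(votes != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (votes[i].freq_khz > max)
			max = votes[i].freq_khz;
	}

	return max;
}

/*
 * Update one vote and return nonzero only if the effective value
 * changed, which is the only case in which listeners get notified.
 */
static int update_vote(struct min_freq_vote *votes, size_t n,
		       size_t idx, int new_khz)
{
	int before = effective_min_freq(votes, n);

	assert(idx < n);
	votes[idx].freq_khz = new_khz;

	return effective_min_freq(votes, n) != before;
}
```

With a driver vote of 1200000 and the core's vote raised from 0 to
1000000, update_vote() reports no change and the effective value stays
at 1200000, which is the behavior described in the quoted text.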

> Since the current value has not changed, there won't be any notifications
> sent to the subsystems which have added their QoS constraints. In
> particular, the amd-pstate driver will not get the notification, and thus,
> the user's request to lower the scaling_min_freq will be ineffective.

The value written by user space to scaling_min_freq is a vote, not a
request. It may not be physically possible to reduce the frequency
below a certain minimum level that need not be known to the user.

> Hence, it is advisable to have a single source of truth for the min and
> max freq QoS constraints between the cpufreq and the cpufreq drivers.
>
> So add a new callback get_init_min_freq() in struct cpufreq_driver,
> which allows amd-pstate (or any other cpufreq driver) to override the
> default min_freq value being set in the policy->min_freq_req. Now
> scaling_min_freq can be modified by the user to any value (lower or
> higher than the init value) later on if desired.
>
> Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@xxxxxxx>
> ---
> drivers/cpufreq/cpufreq.c | 6 +++++-
> include/linux/cpufreq.h | 6 ++++++
> 2 files changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index f98c9438760c..2923068cf5f4 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1361,6 +1361,7 @@ static int cpufreq_online(unsigned int cpu)
>  	bool new_policy;
>  	unsigned long flags;
>  	unsigned int j;
> +	u32 init_min_freq = FREQ_QOS_MIN_DEFAULT_VALUE;
>  	int ret;
>
>  	pr_debug("%s: bringing CPU%u online\n", __func__, cpu);
> @@ -1445,9 +1446,12 @@ static int cpufreq_online(unsigned int cpu)
>  				goto out_destroy_policy;
>  		}
>
> +		if (cpufreq_driver->get_init_min_freq)
> +			init_min_freq = cpufreq_driver->get_init_min_freq(policy);
> +
>  		ret = freq_qos_add_request(&policy->constraints,
>  					   policy->min_freq_req, FREQ_QOS_MIN,
> -					   FREQ_QOS_MIN_DEFAULT_VALUE);
> +					   init_min_freq);
>  		if (ret < 0) {
>  			/*
>  			 * So we don't call freq_qos_remove_request() for an
> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
> index e0e19d9c1323..b20488b55f6c 100644
> --- a/include/linux/cpufreq.h
> +++ b/include/linux/cpufreq.h
> @@ -414,6 +414,12 @@ struct cpufreq_driver {
>  	 * policy is properly initialized, but before the governor is started.
>  	 */
>  	void (*register_em)(struct cpufreq_policy *policy);
> +
> +	/*
> +	 * Set by drivers that want to initialize the policy->min_freq_req with
> +	 * a value different from the default value (0) in cpufreq core.
> +	 */
> +	int (*get_init_min_freq)(struct cpufreq_policy *policy);
> };
>
> /* flags */
> --
> 2.34.1
>