Re: [PATCH 1/2] cpufreq: schedutil: Fix superfluous updates caused by need_freq_update

From: Stephan Gerhold
Date: Tue Apr 08 2025 - 05:02:03 EST


Hi,

On Wed, Dec 11, 2024 at 05:57:32PM -0800, Sultan Alsawaf wrote:
> From: "Sultan Alsawaf (unemployed)" <sultan@xxxxxxxxxxxxxxx>
>
> A redundant frequency update is only truly needed when there is a policy
> limits change with a driver that specifies CPUFREQ_NEED_UPDATE_LIMITS.
>
> In spite of that, drivers specifying CPUFREQ_NEED_UPDATE_LIMITS receive a
> frequency update _all the time_, not just for a policy limits change,
> because need_freq_update is never cleared.
>
> Furthermore, ignore_dl_rate_limit()'s usage of need_freq_update also leads
> to a redundant frequency update, regardless of whether or not the driver
> specifies CPUFREQ_NEED_UPDATE_LIMITS, when the next chosen frequency is the
> same as the current one.
>
> Fix the superfluous updates by only honoring CPUFREQ_NEED_UPDATE_LIMITS
> when there's a policy limits change, and clearing need_freq_update when a
> requisite redundant update occurs.
>
> This is neatly achieved by moving up the CPUFREQ_NEED_UPDATE_LIMITS test
> and instead setting need_freq_update to false in sugov_update_next_freq().
>
> Signed-off-by: Sultan Alsawaf (unemployed) <sultan@xxxxxxxxxxxxxxx>
> ---
> kernel/sched/cpufreq_schedutil.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index 28c77904ea74..e51d5ce730be 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -83,7 +83,7 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
>
> if (unlikely(sg_policy->limits_changed)) {
> sg_policy->limits_changed = false;
> - sg_policy->need_freq_update = true;
> + sg_policy->need_freq_update = cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS);
> return true;
> }
>
> @@ -96,7 +96,7 @@ static bool sugov_update_next_freq(struct sugov_policy *sg_policy, u64 time,
> unsigned int next_freq)
> {
> if (sg_policy->need_freq_update)
> - sg_policy->need_freq_update = cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS);
> + sg_policy->need_freq_update = false;
> else if (sg_policy->next_freq == next_freq)
> return false;
>

This patch breaks cpufreq throttling (e.g. for thermal cooling) for
cpufreq drivers that:

- Have policy->fast_switch_enabled/fast_switch_possible set, but
- Do not have CPUFREQ_NEED_UPDATE_LIMITS flag set

There are several examples for this in the tree (search for
"fast_switch_possible"). Of all those drivers, only intel-pstate and
amd-pstate (sometimes) set CPUFREQ_NEED_UPDATE_LIMITS.

I can reliably reproduce this with scmi-cpufreq on a Qualcomm X1E
laptop:

1. I added some low temperature trip points in the device tree,
together with passive cpufreq cooling.
2. I run a CPU stress test on all CPUs and monitor the temperatures
and CPU frequencies.

When using "performance" governor instead of "schedutil", the CPU
frequencies are being throttled as expected, as soon as the temperature
trip points are reached.

When using "schedutil", the CPU frequencies stay at maximum as long as
the stress test is running. No throttling happens, so the device heats
up far beyond the defined temperature trip points. Throttling is applied
only after stopping the stress test, since this forces schedutil to
re-evaluate the CPU frequency.

Reverting this commit fixes the problem.

Looking at the code, I think the problem is that:
- sg_policy->limits_changed does not result in
sg->policy->need_freq_update without CPUFREQ_NEED_UPDATE_LIMITS
anymore, and
- Without sg->policy->need_freq_update, get_next_freq() skips calling
cpufreq_driver_resolve_freq(), which would normally apply the policy
min/max constraints.

Do we need to set CPUFREQ_NEED_UPDATE_LIMITS for all cpufreq drivers
that set policy->fast_switch_possible? If I'm reading the documentation
comment correctly, that flag is just supposed to enable notifications if
the policy min/max changes, but the resolved target frequency is still
the same. This is not the case here, the target frequency needs to be
throttled, but schedutil isn't applying the new limits.

Any suggestions how to fix this? I'm happy to test patches with my
setup.

Thanks,
Stephan

#regzbot introduced: 8e461a1cb43d69d2fc8a97e61916dce571e6bb31