Re: [PATCH] cpufreq: conservative: Fix incorrect frequency decrease due to stale target

From: Zhongqiu Han

Date: Fri May 08 2026 - 02:48:03 EST


On 5/7/2026 5:33 PM, Viresh Kumar wrote:
On 23-04-26, 15:12, zhenglifeng (A) wrote:
Yes, I think you are right. The behaviors are not the same. I modified this
just in order to keep it consistent with the case exceeding down_threshold.
I'm not sure if this change of behavior is reasonable. Perhaps Rafael or
Viresh could give us some advice.

diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
@@ -104,7 +104,7 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
          dbs_info->down_skip = 0;
            /* if we are already at full speed then break out early */
-        if (requested_freq == policy->max)
+        if (dbs_info->requested_freq == policy->max)
              goto out;

-        if (requested_freq == policy->min)
+        if (dbs_info->requested_freq == policy->min)

What about dropping these `if` blocks completely ? i.e. always call
__cpufreq_driver_target().

__cpufreq_driver_target() already have similar checks in place to optimize
unnecessary freq changes. We don't really need callers to do the same.


Thanks Viresh a lot,

Nice suggestion! — I think it aligns well with the general direction of
addressing this issue and relying more on the cpufreq core.

That said, I have a concern if we drop the decrease-path early-exit
check while keeping the existing condition:

if (requested_freq > freq_step)

unchanged. In that case, when the system is already at policy->min, this
appears to introduce a governor-level oscillation, even though the
effective hardware frequency remains unchanged:

Consider the following scenario
(policy->min = 200, freq_step = 100, hardware already at min, low load):

Iteration N:
requested_freq = 200
decrease path: 200 > 100 -> requested_freq = 100
__cpufreq_driver_target(100, LE):
clamp -> 200, target == cur
-> without CPUFREQ_NEED_UPDATE_LIMITS: driver not invoked
-> with CPUFREQ_NEED_UPDATE_LIMITS: driver is still invoked
dbs_info->requested_freq = 100 <- below min

Iteration N+1:
requested_freq = 100
out-of-range check: reset to policy->cur = 200
decrease path: 200 > 100 -> requested_freq = 100
dbs_info->requested_freq = 100 <- below min again

This results in a repeated oscillation pattern. For drivers with
CPUFREQ_NEED_UPDATE_LIMITS (amd-pstate, intel_cpufreq, cppc_cpufreq),
the driver may be invoked every sampling period even though the
effective frequency does not change. I'm happy to defer to your judgment
on whether this is significant enough to warrant further changes.

Given this, may i know would it make sense to adjust only the decrease
path early-exit condition to use dbs_info->requested_freq (i.e. the last
target actually requested from hardware), rather than the idle-adjusted
local requested_freq?

- if (requested_freq == policy->min)
+ if (dbs_info->requested_freq == policy->min)
goto out;

With this change, the scenario raised by Lifeng
(dbs_info->requested_freq = 400, idle_periods = 2, low load) behaves as
follows:

Iteration 1:
requested_freq = 400
idle decay: requested_freq = 200
if (dbs_info->requested_freq(400) == min(200)) ? NO -> continue
200 > 100 -> requested_freq = 100
__cpufreq_driver_target(100, LE):
clamp -> 200, target(200) != cur(400)
-> driver invoked, hardware: 400 -> 200 MHz [bug fixed]
dbs_info->requested_freq = 100 <- one-time transient value

Iteration 2:
requested_freq = 100
out-of-range check: reset to policy->cur = 200
if (dbs_info->requested_freq(200) == min(200)) ? YES -> goto out

-> Stable from iteration 2 onward, with no extra driver calls.

There is a one-time transient where dbs_info->requested_freq briefly
drops below policy->min, but this is harmless: the existing out-of-range
check corrects it in the next iteration. A similar situation was
discussed before [1], where the conclusion at the time was that
__cpufreq_driver_target() already performs the necessary clamping. If it
would be clearer or more robust to add an explicit guard
here, this can be revisited as well at your discretion.😊

[1]https://lore.kernel.org/linux-pm/20231005105746.ikezg2buza2qwvig@vireshk-i7/

The increase-path early-exit is intentionally left unchanged, to avoid
altering the behavior in the scenario where conservative idle decay
would otherwise be ignored when the previous requested frequency was
already at policy->max.


--
Thx and BRs,
Zhongqiu Han