Re: [PATCH 0/2] cpufreq: governor: Apply limits with target_freq instead of policy->cur

From: Viresh Kumar

Date: Mon Mar 09 2026 - 04:28:19 EST


On 10-02-26, 19:54, Lifeng Zheng wrote:
> The motivation for this patchset cames from a test on our platform:
>
> With conservative governor and some pressure on CPU, the frequency rapidly
> reach the max supported frequency, such as 2GHz.
>
> Later, some frequency division strategies on our platform were triggered
> and the actual frequency become 500MHz -- 1/4 of the OS distribution
> frequency.
>
> At that time, if someone excecutes 'cat cpuinfo_cur_freq', the actual
> frequency will become 250MHz -- 1/4 of the min supported frequency.
>
> After the platform recovering from the frequency division, the frequency
> will stay on 1GHz, until the pressure disappear.
>
> The reason this happens is that in cpufreq_verify_current_freq(), if
> policy->cur != new_freq, policy->update will be queued, which will
> ultimately lead to a call to cpufreq_policy_apply_limits(), and update the
> target frequency to policy->min. And then in cs_dbs_update(), since the
> pressure never vanish, it will always hit the following branches:
>
> if (load > dbs_data->up_threshold) {
> dbs_info->down_skip = 0;
>
> /* if we are already at full speed then break out early */
> if (requested_freq == policy->max)
> goto out;
>
> Therefore, the target frequency will always remain at the lowest frequency.
>
> The branching conditions in cs_dbs_update() may not be strict enough, but
> the root cause of this problem is that the target frequency was updated
> when querying cpuinfo_cur_freq. For ondemand and schedutil governor,
> although the frequency will not always remain at the lowest level without
> rising, will still be min_freq in a short period of time when the query
> action occurs.
>
> Using the freq requested by the governor to decide whether to update the
> target frequency is more reasonable in cpufreq_policy_apply_limits().

I think I understand the problem now. We are tracking the current
frequency state via two cached values, policy->cur and requested_freq
and a mismatch (because of your hardware specific quirks/features)
between them is making things tricky.

Rafael, will this break anything we can think about ?

diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
index e0e847764511..c69577e4f941 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -14,7 +14,6 @@
struct cs_policy_dbs_info {
struct policy_dbs_info policy_dbs;
unsigned int down_skip;
- unsigned int requested_freq;
};

static inline struct cs_policy_dbs_info *to_dbs_info(struct policy_dbs_info *policy_dbs)
@@ -59,10 +58,10 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
{
struct policy_dbs_info *policy_dbs = policy->governor_data;
struct cs_policy_dbs_info *dbs_info = to_dbs_info(policy_dbs);
- unsigned int requested_freq = dbs_info->requested_freq;
struct dbs_data *dbs_data = policy_dbs->dbs_data;
struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
unsigned int load = dbs_update(policy);
+ unsigned int requested_freq = policy->cur;
unsigned int freq_step;

/*
@@ -72,16 +71,6 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
if (cs_tuners->freq_step == 0)
goto out;

- /*
- * If requested_freq is out of range, it is likely that the limits
- * changed in the meantime, so fall back to current frequency in that
- * case.
- */
- if (requested_freq > policy->max || requested_freq < policy->min) {
- requested_freq = policy->cur;
- dbs_info->requested_freq = requested_freq;
- }
-
freq_step = get_freq_step(cs_tuners, policy);

/*
@@ -113,7 +102,6 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)

__cpufreq_driver_target(policy, requested_freq,
CPUFREQ_RELATION_HE);
- dbs_info->requested_freq = requested_freq;
goto out;
}

@@ -137,7 +125,6 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)

__cpufreq_driver_target(policy, requested_freq,
CPUFREQ_RELATION_LE);
- dbs_info->requested_freq = requested_freq;
}

out:
@@ -310,7 +297,6 @@ static void cs_start(struct cpufreq_policy *policy)
struct cs_policy_dbs_info *dbs_info = to_dbs_info(policy->governor_data);

dbs_info->down_skip = 0;
- dbs_info->requested_freq = policy->cur;
}

static struct dbs_governor cs_governor = {

-------------------------8<-------------------------

This always pick the next freq based on policy->cur instead of the
real last request. The two can differ if:
- the hardware plays with current frequency, as is the case here.
- or the limits change and that changes the current frequency (in
which case we will be at policy->min/max anyway).

--
viresh