Re: [PATCH 0/2] cpufreq: governor: Apply limits with target_freq instead of policy->cur

From: Rafael J. Wysocki

Date: Mon Mar 09 2026 - 08:45:00 EST


On Mon, Mar 9, 2026 at 9:27 AM Viresh Kumar <viresh.kumar@xxxxxxxxxx> wrote:
>
> On 10-02-26, 19:54, Lifeng Zheng wrote:
> > The motivation for this patchset cames from a test on our platform:
> >
> > With conservative governor and some pressure on CPU, the frequency rapidly
> > reach the max supported frequency, such as 2GHz.
> >
> > Later, some frequency division strategies on our platform were triggered
> > and the actual frequency become 500MHz -- 1/4 of the OS distribution
> > frequency.
> >
> > At that time, if someone excecutes 'cat cpuinfo_cur_freq', the actual
> > frequency will become 250MHz -- 1/4 of the min supported frequency.
> >
> > After the platform recovering from the frequency division, the frequency
> > will stay on 1GHz, until the pressure disappear.
> >
> > The reason this happens is that in cpufreq_verify_current_freq(), if
> > policy->cur != new_freq, policy->update will be queued, which will
> > ultimately lead to a call to cpufreq_policy_apply_limits(), and update the
> > target frequency to policy->min. And then in cs_dbs_update(), since the
> > pressure never vanish, it will always hit the following branches:
> >
> > if (load > dbs_data->up_threshold) {
> > dbs_info->down_skip = 0;
> >
> > /* if we are already at full speed then break out early */
> > if (requested_freq == policy->max)
> > goto out;
> >
> > Therefore, the target frequency will always remain at the lowest frequency.
> >
> > The branching conditions in cs_dbs_update() may not be strict enough, but
> > the root cause of this problem is that the target frequency was updated
> > when querying cpuinfo_cur_freq. For ondemand and schedutil governor,
> > although the frequency will not always remain at the lowest level without
> > rising, will still be min_freq in a short period of time when the query
> > action occurs.
> >
> > Using the freq requested by the governor to decide whether to update the
> > target frequency is more reasonable in cpufreq_policy_apply_limits().
>
> I think I understand the problem now. We are tracking the current
> frequency state via two cached values, policy->cur and requested_freq
> and a mismatch (because of your hardware specific quirks/features)
> between them is making things tricky.
>
> Rafael, will this break anything we can think about ?

I can't recall, but the new code is simpler, so unless anyone has a
particular heartburn with it, go for it.

> diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
> index e0e847764511..c69577e4f941 100644
> --- a/drivers/cpufreq/cpufreq_conservative.c
> +++ b/drivers/cpufreq/cpufreq_conservative.c
> @@ -14,7 +14,6 @@
> struct cs_policy_dbs_info {
> struct policy_dbs_info policy_dbs;
> unsigned int down_skip;
> - unsigned int requested_freq;
> };
>
> static inline struct cs_policy_dbs_info *to_dbs_info(struct policy_dbs_info *policy_dbs)
> @@ -59,10 +58,10 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
> {
> struct policy_dbs_info *policy_dbs = policy->governor_data;
> struct cs_policy_dbs_info *dbs_info = to_dbs_info(policy_dbs);
> - unsigned int requested_freq = dbs_info->requested_freq;
> struct dbs_data *dbs_data = policy_dbs->dbs_data;
> struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
> unsigned int load = dbs_update(policy);
> + unsigned int requested_freq = policy->cur;
> unsigned int freq_step;
>
> /*
> @@ -72,16 +71,6 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
> if (cs_tuners->freq_step == 0)
> goto out;
>
> - /*
> - * If requested_freq is out of range, it is likely that the limits
> - * changed in the meantime, so fall back to current frequency in that
> - * case.
> - */
> - if (requested_freq > policy->max || requested_freq < policy->min) {
> - requested_freq = policy->cur;
> - dbs_info->requested_freq = requested_freq;
> - }
> -
> freq_step = get_freq_step(cs_tuners, policy);
>
> /*
> @@ -113,7 +102,6 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
>
> __cpufreq_driver_target(policy, requested_freq,
> CPUFREQ_RELATION_HE);
> - dbs_info->requested_freq = requested_freq;
> goto out;
> }
>
> @@ -137,7 +125,6 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
>
> __cpufreq_driver_target(policy, requested_freq,
> CPUFREQ_RELATION_LE);
> - dbs_info->requested_freq = requested_freq;
> }
>
> out:
> @@ -310,7 +297,6 @@ static void cs_start(struct cpufreq_policy *policy)
> struct cs_policy_dbs_info *dbs_info = to_dbs_info(policy->governor_data);
>
> dbs_info->down_skip = 0;
> - dbs_info->requested_freq = policy->cur;
> }
>
> static struct dbs_governor cs_governor = {
>
> -------------------------8<-------------------------
>
> This always pick the next freq based on policy->cur instead of the
> real last request. The two can differ if:
> - the hardware plays with current frequency, as is the case here.
> - or the limits change and that changes the current frequency (in
> which case we will be at policy->min/max anyway).
>
> --
> viresh