Re: [PATCH] cpufreq: schedutil: Don't ignore limit changes when util is unchanged
From: Sultan Alsawaf
Date: Thu Apr 10 2025 - 12:04:26 EST
On Thu, Apr 10, 2025 at 05:34:39PM +0200, Rafael J. Wysocki wrote:
> On Thu, Apr 10, 2025 at 4:45 AM Sultan Alsawaf <sultan@xxxxxxxxxxxxxxx> wrote:
> >
> > From: Sultan Alsawaf <sultan@xxxxxxxxxxxxxxx>
> >
> > When utilization is unchanged, a policy limits update is ignored unless
> > CPUFREQ_NEED_UPDATE_LIMITS is set. This occurs because limits_changed
> > depends on the old broken behavior of need_freq_update to trigger a call
> > into cpufreq_driver_resolve_freq() to evaluate the changed policy limits.
> >
> > After fixing need_freq_update, limit changes are ignored without
> > CPUFREQ_NEED_UPDATE_LIMITS, at least until utilization changes enough to
> > make map_util_freq() return something different.
> >
> > Fix the ignored limit changes by preserving the value of limits_changed
> > until get_next_freq() is called, so limits_changed can trigger a call to
> > cpufreq_driver_resolve_freq().
> >
> > Reported-and-tested-by: Stephan Gerhold <stephan.gerhold@xxxxxxxxxx>
> > Link: https://lore.kernel.org/lkml/Z_Tlc6Qs-tYpxWYb@xxxxxxxxxx
> > Fixes: 8e461a1cb43d6 ("cpufreq: schedutil: Fix superfluous updates caused by need_freq_update")
> > Signed-off-by: Sultan Alsawaf <sultan@xxxxxxxxxxxxxxx>
> > ---
> > kernel/sched/cpufreq_schedutil.c | 5 +++--
> > 1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> > index 1a19d69b91ed3..f37b999854d52 100644
> > --- a/kernel/sched/cpufreq_schedutil.c
> > +++ b/kernel/sched/cpufreq_schedutil.c
> > @@ -82,7 +82,6 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
> >  		return false;
> >
> >  	if (unlikely(sg_policy->limits_changed)) {
> > -		sg_policy->limits_changed = false;
> >  		sg_policy->need_freq_update = cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS);
> >  		return true;
> >  	}
> > @@ -171,9 +170,11 @@ static unsigned int get_next_freq(struct sugov_policy *sg_policy,
> >  	freq = get_capacity_ref_freq(policy);
> >  	freq = map_util_freq(util, freq, max);
> >
> > -	if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update)
> > +	if (freq == sg_policy->cached_raw_freq && !sg_policy->limits_changed &&
> > +	    !sg_policy->need_freq_update)
> >  		return sg_policy->next_freq;
> >
> > +	sg_policy->limits_changed = false;
>
> AFAICS, after this code modification, a limit change may be missed due
> to a possible race with sugov_limits() which cannot happen if
> sg_policy->limits_changed is only cleared when it is set before
> updating sg_policy->need_freq_update.
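
I don't think that's the case, because sg_policy->limits_changed is cleared
before the new policy limits are evaluated in cpufreq_driver_resolve_freq().
A limits update that lands after the clear sets the flag again in
sugov_limits(), so it is picked up on the next pass.

Granted, if we wanted to be really certain of this, we'd need release semantics.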
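
To make the ordering argument concrete, here is a rough userspace model using
C11 atomics. It is only a sketch: struct model_policy, model_set_limit() and
model_next_freq() are hypothetical names, not kernel code, and the
acquire/release exchange is one assumed way of getting the ordering rather than
what the patch does.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_policy {
	atomic_uint max;              /* stands in for policy->max */
	atomic_bool limits_changed;   /* stands in for sg_policy->limits_changed */
	unsigned int cached_raw_freq; /* last raw frequency that was evaluated */
	unsigned int next_freq;       /* last frequency that was chosen */
};

static void model_init(struct model_policy *p, unsigned int max)
{
	atomic_init(&p->max, max);
	atomic_init(&p->limits_changed, false);
	p->cached_raw_freq = 0;
	p->next_freq = 0;
}

/* Writer side, loosely modelled on sugov_limits(): limit first, then flag. */
static void model_set_limit(struct model_policy *p, unsigned int new_max)
{
	atomic_store_explicit(&p->max, new_max, memory_order_relaxed);
	/* Release: whoever observes the flag as set also observes new_max. */
	atomic_store_explicit(&p->limits_changed, true, memory_order_release);
}

/* Reader side, loosely modelled on get_next_freq(): clear flag, then read limit. */
static unsigned int model_next_freq(struct model_policy *p, unsigned int raw_freq)
{
	/* Clear the flag and learn its old value in a single atomic step. */
	bool changed = atomic_exchange_explicit(&p->limits_changed, false,
						memory_order_acq_rel);

	/* Nothing new to evaluate: reuse the previous result. */
	if (raw_freq == p->cached_raw_freq && !changed)
		return p->next_freq;

	/* The limit is read only after the flag has been cleared. */
	unsigned int max = atomic_load_explicit(&p->max, memory_order_relaxed);

	p->cached_raw_freq = raw_freq;
	p->next_freq = raw_freq < max ? raw_freq : max;
	return p->next_freq;
}
```

If the exchange observes the flag set by model_set_limit(), the load that
follows sees that limit. If the flag is set after the exchange, it stays set
and the next call re-evaluates, even when raw_freq is unchanged.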

Actually, looking closer at cpufreq.c, isn't there already a race on the
updated policy limits (policy->min and policy->max), since they can be updated
again while schedutil reads them via cpufreq_driver_resolve_freq()?

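
The kind of read-side race in question can be sketched the same way. This is a
hypothetical userspace model (struct model_limits and model_clamp() are made-up
names), not the code in cpufreq.c: it takes one snapshot of each limit and
tolerates a momentarily inconsistent min/max pair.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the policy->min / policy->max pair. */
struct model_limits {
	atomic_uint min;
	atomic_uint max;
};

/* Clamp target to [min, max] using a single snapshot of each limit. */
static unsigned int model_clamp(struct model_limits *l, unsigned int target)
{
	/*
	 * Two independent loads: a concurrent writer may have updated only
	 * one of the limits so far, so the pair can be inconsistent.
	 */
	unsigned int min = atomic_load_explicit(&l->min, memory_order_relaxed);
	unsigned int max = atomic_load_explicit(&l->max, memory_order_relaxed);

	/* Tolerate a torn pair by never letting min exceed max. */
	if (min > max)
		min = max;

	if (target < min)
		return min;
	if (target > max)
		return max;
	return target;
}
```

Reading each limit once into a local at least keeps the clamp self-consistent
within one call; it does not make the pair atomic with respect to the writer.
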
Sultan