Re: [PATCH] sched/schedutil: Don't set next_freq to UINT_MAX

From: Rafael J. Wysocki
Date: Tue May 08 2018 - 16:54:39 EST


On Tue, May 8, 2018 at 8:42 AM, Viresh Kumar <viresh.kumar@xxxxxxxxxx> wrote:
> The schedutil driver sets sg_policy->next_freq to UINT_MAX on certain
> occasions:
> - In sugov_start(), when the schedutil governor is started for a group
> of CPUs.
> - And whenever we need to force a freq update before rate-limit
> duration, which happens when:
> - there is an update in cpufreq policy limits.
> - Or when the utilization of DL scheduling class increases.
>
> In return, get_next_freq() doesn't return a cached next_freq value but
> instead recalculates the next frequency. This has some side effects
> though and may significantly delay a required increase in frequency.
>
> In sugov_update_single() we try to avoid decreasing frequency if the CPU
> has not been idle recently. Consider this scenario, the available range
> of frequencies for a CPU are from 800 MHz to 2.5 GHz and current
> frequency is 800 MHz. From one of the call paths
> sg_policy->need_freq_update is set to true and hence
> sg_policy->next_freq is set to UINT_MAX. Now if the CPU had been busy,
> next_f will always be less than UINT_MAX, whatever the value of next_f
> is. And so even when we wanted to increase the frequency, we will
> overwrite next_f with UINT_MAX and will not change the frequency
> eventually. This will continue until the time CPU stays busy. This isn't
> cross checked with any specific test cases, but rather based on general
> code review.
>
> Fix that by not resetting the sg_policy->need_freq_update flag from
> sugov_should_update_freq() but get_next_freq() and we wouldn't need to
> overwrite sg_policy->next_freq anymore.
>
> Cc: 4.12+ <stable@xxxxxxxxxxxxxxx> # 4.12+
> Fixes: b7eaf1aab9f8 ("cpufreq: schedutil: Avoid reducing frequency of busy CPUs prematurely")

The rest of the chantelog is totally disconnected from this commit.

So the problem is that sugov_update_single() doesn't check the special
UNIT_MAX value before assigning sg_policy->next_freq to next_f. Fair
enough.

I don't see why the patch is the right fix for that, however.

Thanks,
Rafael