RE: [PATCH] cpufreq: schedutil: Don't skip freq update when limits change

From: Doug Smythies
Date: Thu Jul 25 2019 - 14:07:41 EST


Hi,

I am having trouble keeping up.
Here is what I have so far:

On 2019.07.24 04:43 Viresh Kumar wrote:
> On 23-07-19, 12:27, Rafael J. Wysocki wrote:
>> On Tue, Jul 23, 2019 at 11:15 AM Viresh Kumar <viresh.kumar@xxxxxxxxxx> wrote:

>>> Though there is one difference between intel_cpufreq and acpi_cpufreq,
>>> intel_cpufreq has fast_switch_possible=true and so it uses slightly
>>> different path in schedutil. I tried to look from that perspective as
>>> well but couldn't find anything wrong.
>>
>> acpi-cpufreq should use fast switching on the Doug's system too.
>
> Ah okay.
>
>>> If you still find intel_cpufreq to be broken, even with this patch,

It is.

>>> please set fast_switch_possible=false instead of true in
>>> __intel_pstate_cpu_init() and try tests again. That shall make it very
>>> much similar to acpi-cpufreq driver.
>>
>> I wonder if this helps.

It does not help.

>> Even so, we want fast switching to be used by intel_cpufreq.

>
> With both using fast switching it shouldn't make any difference.

>> Anyway, it looks like the change reverted by the Doug's patch
>> introduced a race condition that had not been present before. Namely,
>> need_freq_update is cleared in get_next_freq() when it is set _or_
>> when the new freq is different from the cached one, so in the latter
>> case if it happens to be set by sugov_limits() after evaluating
>> sugov_should_update_freq() (which returned 'true' for timing reasons),
>> that update will be lost now. [Previously the update would not be
>> lost, because the clearing of need_freq_update depended only on its
>> current value.] Where it matters is that in the "need_freq_update set"
>> case, the "premature frequency reduction avoidance" should not be
>> applied (as you noticed and hence the $subject patch).
>>
>> However, even with the $subject patch, need_freq_update may still be
>> set by sugov_limits() after the check added by it and then cleared by
>> get_next_freq(), so it doesn't really eliminate the problem.
>>
>> IMO eliminating would require invalidating next_freq this way or
>> another when need_freq_update is set in sugov_should_update_freq(),
>> which was done before commit ecd2884291261e3fddbc7651ee11a20d596bb514.
>
> Hmm, so to avoid locking in fast path we need two variable group to
> protect against this kind of issues. I still don't want to override
> next_freq with a special meaning as it can cause hidden bugs, we have
> seen that earlier.
>
> What about something like this then ?

I tried the patch ("patch2"). It did not fix the issue.

To summarize, all kernel 5.2 based, all intel_cpufreq driver and schedutil governor:

Test: Does a busy system respond to maximum CPU clock frequency reduction?

stock, unaltered: No.
revert ecd2884291261e3fddbc7651ee11a20d596bb514: Yes
viresh patch: No.
fast_switch edit: No.
viresh patch2: No.

References (and procedures used):
https://marc.info/?l=linux-pm&m=156346478429147&w=2
https://marc.info/?l=linux-kernel&m=156343125319461&w=2

... Doug