RE: [PATCH V3 2/2] cpufreq: intel_pstate: Implement ->resolve_freq()

From: Doug Smythies
Date: Sat Aug 03 2019 - 11:00:39 EST


On 2019.08.02 02:28 Rafael J. Wysocki wrote:
> On Friday, August 2, 2019 11:17:55 AM CEST Rafael J. Wysocki wrote:
>> On Fri, Aug 2, 2019 at 7:44 AM Viresh Kumar <viresh.kumar@xxxxxxxxxx> wrote:
>>>
>>> Intel pstate driver exposes min_perf_pct and max_perf_pct sysfs files,
>>> which can be used to force a limit on the min/max P state of the driver.
>>> Though these files eventually control the min/max frequencies that the
>>> CPUs will run at, they don't make a change to policy->min/max values.
>>
>> That's correct.
>>
>>> When the values of these files are changed (in passive mode of the
>>> driver), it leads to calling ->limits() callback of the cpufreq
>>> governors, like schedutil. On a call to it the governors shall
>>> forcefully update the frequency to come within the limits.
>>
>> OK, so the problem is that it is a bug to invoke the governor's ->limits()
>> callback without updating policy->min/max, because that's what
>> "limits" mean to the governors.
>>
>> Fair enough.
>
> AFAICS this can be addressed by adding PM QoS freq limits requests of each CPU to
> intel_pstate in the passive mode such that changing min_perf_pct or max_perf_pct
> will cause these requests to be updated.

All governors for the intel_cpufreq (intel_pstate in passive mode) CPU frequency
scaling driver are broken with respect to this issue, not just the schedutil
governor. My initial escalation focused on acpi-cpufreq/schedutil and
intel_cpufreq/schedutil, as both were broken and both were fixed by the revert
I initially submitted. What can I say, I missed that the other intel_cpufreq
governors were also affected.

I tested all of them: conservative, ondemand, userspace, powersave, performance,
and schedutil.
Note that no governor other than schedutil uses resolve_freq().
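
For illustration only, here is a toy model of the problem described in the
quoted text. It is not kernel code: struct toy_policy and every toy_* function
are hypothetical stand-ins, and the percentage-to-kHz conversion is a
simplifying assumption. It only shows why a ->limits() call cannot pull the
frequency down when the cap lives in the driver and policy->min/max stay
unchanged.

```c
#include <assert.h>

/* Toy stand-in for the few struct cpufreq_policy fields that matter here. */
struct toy_policy {
	unsigned int min;	/* kHz, analogous to policy->min */
	unsigned int max;	/* kHz, analogous to policy->max */
	unsigned int cur;	/* kHz, analogous to policy->cur */
};

/* Hypothetical model of a governor's ->limits(): clamp cur into [min, max]. */
static void toy_governor_limits(struct toy_policy *p)
{
	assert(p->min <= p->max);
	if (p->cur > p->max)
		p->cur = p->max;
	else if (p->cur < p->min)
		p->cur = p->min;
}

/* Assumed conversion of a max_perf_pct style percentage into kHz. */
static unsigned int toy_pct_to_khz(unsigned int turbo_khz, unsigned int pct)
{
	return turbo_khz / 100 * pct;
}

/*
 * Apply a percentage cap, then invoke the toy ->limits().
 * update_policy == 0: the cap is known only to the driver (the reported bug).
 * update_policy != 0: the cap is also written to the policy's max first.
 * Returns the frequency the governor ends up with.
 */
static unsigned int toy_apply_max_perf_pct(struct toy_policy *p,
					   unsigned int turbo_khz,
					   unsigned int pct, int update_policy)
{
	unsigned int cap = toy_pct_to_khz(turbo_khz, pct);

	if (update_policy && cap < p->max)
		p->max = cap < p->min ? p->min : cap;

	toy_governor_limits(p);
	return p->cur;
}
```

In this model, with update_policy set to 0 the governor sees nothing to clamp
against and cur is left where it was. With update_policy set to 1 the same
->limits() call brings cur down to the cap.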

... Doug