Re: [PATCH v3 3/3] [RFC] CPUFreq: Add support for cpu-perf-dependencies

From: Lukasz Luba
Date: Fri Nov 06 2020 - 06:14:30 EST

On 11/6/20 10:55 AM, Viresh Kumar wrote:
On 06-11-20, 10:37, Lukasz Luba wrote:
Good question.

How about a different interface for those cpufreq drivers?
That new registration API would allow specifying the cpumask.
Or it could rely on the EM cpumask: em_span_cpus(em)

Currently we have two ways to register a cooling device:
1. when the cpufreq driver sets the CPUFREQ_IS_COOLING_DEV flag, the core
will register the cooling device
2. the cpufreq driver can explicitly call the registration function
cpufreq_cooling_register() with 'policy' as argument

That would need a substantial change to the cpufreq cooling code, from
policy-oriented to a custom driver's cpumask (like EM registration).

I am even wondering if we should really make that change. Why do we
need the combined load of the CPUs to be sent back to the IPA governor?
Why shouldn't they all do that (they == cdev)?

This is a bit confusing to me, sorry about that. The cpufreq governors
look at the utilization of all the CPUs and set the frequency based on
the highest utilization (and not the total util).

While in this case we present the total load of the CPUs to the IPA
(based on the current frequency of the CPUs), in response to which it
tells us the frequency at which all the CPUs of the policy can run
(I am not even sure if it is the right thing to do as the CPUs have
different loads). And how do we fit this dependent_cpus thing into
this?

Sorry, I am not sure what's the right way of doing things here.
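
To make the mismatch described above concrete, here is a minimal user-space
sketch in C. It is only a toy model, not kernel code: the function names
policy_max_util() and policy_total_load() and the percent-based utilization
values are hypothetical, chosen for illustration. The first function mimics
the governor-style view (highest per-CPU utilization), the second the view
presented to IPA (summed load of the CPUs in the policy).

```c
#include <stddef.h>

/*
 * Toy model (assumption): per-CPU utilization is given in percent, 0..100.
 * Neither function exists in the kernel; they only illustrate the two views.
 */

/* Governor-style view: the highest per-CPU utilization in the policy. */
unsigned int policy_max_util(const unsigned int *util, size_t ncpus)
{
	unsigned int max = 0;

	for (size_t i = 0; i < ncpus; i++)
		if (util[i] > max)
			max = util[i];

	return max;
}

/* IPA-style view: the summed load of all CPUs in the policy. */
unsigned int policy_total_load(const unsigned int *util, size_t ncpus)
{
	unsigned int total = 0;

	for (size_t i = 0; i < ncpus; i++)
		total += util[i];

	return total;
}
```

With four CPUs at {90, 10, 10, 10} and four CPUs at {30, 30, 30, 30}, the
total load is 120 in both cases, while the maximum is 90 in the first and 30
in the second. The summed view alone therefore cannot tell these two
situations apart, which is the concern raised about CPUs with different loads.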


I also had similar doubts, because if we make frequency requests
independently for each CPU, why not have N cooling devs, each of which
would independently set the QoS max freq for its CPU...

What convinced me:
EAS and FIE would know the 'real' frequency of the cluster; IPA
can use it as well and have only one cooling device per cluster.

We would like to keep this old style 'one cooling device per cpuset'.
I don't have a strong opinion, and if it turns out that there are
some errors in freq estimation for the cluster, then maybe it does make
more sense to have a cdev per CPU...