[PATCH v2 2/2] [RFC] CPUFreq: Add support for cpu-perf-dependencies
From: Nicola Mazzucato
Date: Thu Sep 24 2020 - 05:53:11 EST
I am seeking some feedback/comments on the following approach.
Intro:
Information about CPU performance dependencies is beneficial for systems
where the f/w description of the CPU performance control domain differs
from the clock domain, e.g. per-CPU control with multiple CPUs sharing a
clock, and where kernel OSPM s/w components need to take CPU performance
dependencies into account.
Essentially these s/w components will have to be provided with
this information from dt and this RFC is presenting a possible way
to do so.
Few implementation details are given, as this RFC aims to present the
overall approach.
Proposal:
The cpufreq framework currently assumes that a policy covers a group of
CPUs that are controlled together. The energy model and thermal
frameworks assume that the policy cpumask also describes the performance
dependency relation. This assumption is no longer generally valid, so we
need a way to represent both the control and the performance relations in
cpufreq. The proposal is to have one cpufreq_policy instance per control
domain, and to add a new cpumask 'dependent_cpus' to the policy to
represent the CPU performance dependencies.
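To illustrate the split between the two relations, here is a minimal
user-space sketch, not kernel code. It assumes a plain 64-bit bitmask as a
stand-in for cpumask_var_t; the names cpumask_model_t, policy_model,
policy_for_cpu() and cpu_is_dependent() are hypothetical and exist only for
this example. It models per-CPU control with several CPUs sharing a clock.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for cpumask_var_t: bit n set means CPU n (n < 64). */
typedef uint64_t cpumask_model_t;

/* Simplified model of the two relations a policy would carry. */
struct policy_model {
	cpumask_model_t cpus;           /* control domain: CPUs controlled together */
	cpumask_model_t dependent_cpus; /* performance domain: CPUs sharing a clock */
};

/*
 * Build the policy for a CPU that is controlled on its own but sits in the
 * shared clock domain described by clk_domain.
 */
static struct policy_model policy_for_cpu(unsigned int cpu,
					  cpumask_model_t clk_domain)
{
	struct policy_model p = {
		.cpus = (cpumask_model_t)1 << cpu,
		.dependent_cpus = clk_domain,
	};

	return p;
}

/* True if 'cpu' shares a clock with the CPUs covered by this policy. */
static bool cpu_is_dependent(const struct policy_model *p, unsigned int cpu)
{
	return (p->dependent_cpus >> cpu) & 1;
}
```

With a clock domain of CPUs 0-3 (mask 0x0f), the policy for CPU 2 controls
only CPU 2 while still reporting CPUs 0-3 as performance-dependent.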
The main reason for a new cpumask is that although 'related_cpus'
could be (or could have been) used for such a purpose, its meaning has
changed over time. Initially it was designed specifically for this
purpose [1], but eventually it changed to mean online + offline cpus when
sw coordination is in use [2,3].
There is also a 'shared_type' field in cpufreq_policy which provides
info about the coordination type (NONE, SW_ANY, SW_ALL, HW). Currently it
is used only for ACPI, but I assume it can also be used to indicate the
coordination type outside of ACPI. Currently there is no use of TYPE_HW.
Provided that the cpufreq driver will populate dependent_cpus and
set shared_type, the s/w components that rely on such description (we
focus on energy-model and cpufreq_cooling for now) will always be provided
with the correct information, when picking the new cpumask.
Proposed changes (at high level)(4):
1) cpufreq: Add new dependent_cpus cpumask in cpufreq_policy
* New cpumask addition
<snippet>
struct cpufreq_policy {
cpumask_var_t related_cpus; /* Online + Offline CPUs */
cpumask_var_t real_cpus; /* Related and present */
+ /*
+ * CPUs with hardware clk/perf dependencies
+ *
+ * For sw components that rely on h/w info of clk dependencies when hw
+ * coordinates. This cpumask should always reflect the hw dependencies.
+ */
+ cpumask_var_t dependent_cpus; /* all clk-dependent cpus */
+
unsigned int shared_type; /* ACPI: ANY or ALL affected CPUs
</snippet>
* Fallback mechanism for dependent_cpus. With this, s/w components can
always pick dependent_cpus regardless of the coordination type.
<snippet>
static int cpufreq_online(unsigned int cpu)
/* related_cpus should at least include policy->cpus. */
cpumask_copy(policy->related_cpus, policy->cpus);
+
+ /* dependent_cpus should differ only when hw coordination is in place */
+ if (policy->shared_type != CPUFREQ_SHARED_TYPE_HW)
+ cpumask_copy(policy->dependent_cpus, policy->cpus);
}
</snippet>
* Add sysfs attribute for dependent_cpus
2) drivers/thermal/cpufreq_cooling: Replace related_cpus with dependent_cpus
3) drivers/firmware/arm_scmi/perf.c: Parse dt for `cpu-performance-dependencies`
* Parse dt for `cpu-performance-dependencies` optional node
* Store internally performance dependencies
* Add API to get dependent_cpus if required
4) drivers/cpufreq/scmi-cpufreq: Register EM device with the proper cpumask
* Check for performance dependencies and get dependent_cpus
* Set policy->shared_type accordingly
* Provide to EM the correct performance dependencies information
<snippet>
static int scmi_cpufreq_init(struct cpufreq_policy *policy)
policy->fast_switch_possible =
handle->perf_ops->fast_switch_possible(handle, cpu_dev);
- em_dev_register_perf_domain(cpu_dev, nr_opp, &em_cb, policy->cpus);
+ /*
+ * EM needs accurate information about clk boundaries, thus provide the
+ * correct cpumask.
+ */
+ if (handle->perf_ops->has_perf_deps(handle))
+ em_dev_register_perf_domain(cpu_dev, nr_opp, &em_cb,
+ policy->dependent_cpus);
+ else
+ em_dev_register_perf_domain(cpu_dev, nr_opp, &em_cb,
+ policy->cpus);
</snippet>
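The fallback in 1) and the mask selection in 4) can be put together in a
small user-space sketch, not kernel code. It assumes a 64-bit bitmask in
place of cpumask_var_t. The enum names follow the CPUFREQ_SHARED_TYPE_*
constants, but the numeric values here are only illustrative, and
policy_model, apply_dependent_cpus_fallback() and em_mask() are hypothetical
names used for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for cpumask_var_t: bit n set means CPU n (n < 64). */
typedef uint64_t cpumask_model_t;

/* Coordination types, named after CPUFREQ_SHARED_TYPE_* (values illustrative). */
enum shared_type_model {
	SHARED_TYPE_NONE,
	SHARED_TYPE_HW,
	SHARED_TYPE_ALL,
	SHARED_TYPE_ANY,
};

struct policy_model {
	cpumask_model_t cpus;           /* control domain */
	cpumask_model_t dependent_cpus; /* filled by the driver when hw coordinates */
	enum shared_type_model shared_type;
};

/*
 * Step 1 fallback: unless hw coordination is in place, dependent_cpus is
 * simply a copy of the policy's cpus, so consumers can always read it.
 */
static void apply_dependent_cpus_fallback(struct policy_model *p)
{
	if (p->shared_type != SHARED_TYPE_HW)
		p->dependent_cpus = p->cpus;
}

/*
 * Step 4: pick the mask handed to the energy model. 'has_perf_deps' models
 * the result of the proposed has_perf_deps() query.
 */
static cpumask_model_t em_mask(const struct policy_model *p, bool has_perf_deps)
{
	return has_perf_deps ? p->dependent_cpus : p->cpus;
}
```

With hw coordination, a policy controlling only CPU 0 (0x1) whose driver set
dependent_cpus to 0xf would register the energy model with 0xf. With sw
coordination the fallback makes both masks identical, so either branch gives
the same result.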
Any other suggestions are welcome.
Thanks
Nicola
[1] 'commit e8628dd06d66 ("[CPUFREQ] expose cpufreq coordination requirements regardless of coordination mechanism")'
[2] 'commit 951fc5f45836 ("cpufreq: Update Documentation for cpus and related_cpus")'
[3] 'commit f4fd3797848a ("acpi-cpufreq: Add new sysfs attribute freqdomain_cpus")'
--
2.27.0