Re: [PATCH v2 6/6] cpufreq/cppc: set the frequency used for capacity computation
From: Pierre Gondois
Date: Wed Oct 11 2023 - 06:28:06 EST
Hello Vincent,
On 10/9/23 12:36, Vincent Guittot wrote:
The cppc cpufreq driver can register an artificial energy model. In such a
case, it also has to register the frequency that is used to define the CPU
capacity.
Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
---
drivers/cpufreq/cppc_cpufreq.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index fe08ca419b3d..24c6ba349f01 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -636,6 +636,21 @@ static int populate_efficiency_class(void)
 	return 0;
 }
+
+static void cppc_cpufreq_set_capacity_ref_freq(struct cpufreq_policy *policy)
+{
+	struct cppc_perf_caps *perf_caps;
+	struct cppc_cpudata *cpu_data;
+	unsigned int ref_freq;
+
+	cpu_data = policy->driver_data;
+	perf_caps = &cpu_data->perf_caps;
+
+	ref_freq = cppc_cpufreq_perf_to_khz(cpu_data, perf_caps->highest_perf);
+
+	per_cpu(capacity_ref_freq, policy->cpu) = ref_freq;
'capacity_ref_freq' seems to be updated only if CONFIG_ENERGY_MODEL is set. However, in
[1], get_capacity_ref_freq() relies on 'capacity_ref_freq'. I believe the cpufreq_schedutil
governor should also have a valid 'capacity_ref_freq' value when the CPPC cpufreq driver
is used without an energy model.
Also, 'capacity_ref_freq' seems to be set only for 'policy->cpu'. I believe it should
be set for every CPU of the perf domain, in case 'policy->cpu' goes offline.
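To illustrate the second point, here is a minimal userspace sketch (not driver code) of storing the reference frequency for every CPU of a domain rather than only for one. The array, the NR_CPUS value, the bitmask and the helper name are all hypothetical stand-ins for the per-CPU variable and the policy's CPU mask.

```c
#include <stdint.h>

#define NR_CPUS 8	/* hypothetical CPU count for this userspace model */

/* Stand-in for the per-CPU 'capacity_ref_freq' variable, in kHz. */
static unsigned int capacity_ref_freq[NR_CPUS];

/*
 * Store ref_freq for every CPU whose bit is set in shared_cpu_mask,
 * a plain bitmask standing in for the CPUs sharing the policy.
 */
static void set_capacity_ref_freq(uint32_t shared_cpu_mask,
				  unsigned int ref_freq)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (shared_cpu_mask & (UINT32_C(1) << cpu))
			capacity_ref_freq[cpu] = ref_freq;
	}
}
```

In the driver itself, I would expect the equivalent to be a for_each_cpu() loop over the policy's CPUs, but I have not tested that.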
Another thing, related to my comments on [1] and in [2]: for CPPC, the max capacity
matches the boosting frequency. We have:
'non-boosted max capacity' < 'boosted max capacity'.
If boosting is not enabled, the CPU utilization can still go above the 'non-boosted max
capacity'. The overutilization of the system seems to be triggered by comparing the CPU
util to the 'boosted max capacity'. So systems might not be detected as overutilized.
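As a rough numeric sketch of this concern: the check below mirrors the ~20% margin of the scheduler's fits_capacity() as I understand it, rewritten as a standalone function. The capacity and utilization values used in the note after the code are made up for illustration.

```c
#include <stdbool.h>

/* True if util fits within capacity with roughly 20% of headroom. */
static bool fits_capacity(unsigned long util, unsigned long capacity)
{
	return util * 1280 < capacity * 1024;
}

/* A CPU counts as overutilized when its utilization no longer fits. */
static bool cpu_overutilized(unsigned long util, unsigned long capacity)
{
	return !fits_capacity(util, capacity);
}
```

With a hypothetical 'boosted max capacity' of 1024 and a 'non-boosted max capacity' of 800, a utilization of 750 is not flagged when compared against 1024, but would be flagged when compared against 800.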
For the EAS energy computation, em_cpu_energy() tries to predict the frequency that will
be used. It is currently unknown to the function that the frequency request will be
clamped by __resolve_freq():
  get_next_freq()
  \-cpufreq_driver_resolve_freq()
    \-__resolve_freq()
This means that the energy computation might use boosting frequencies, which are not
available.
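A simplified sketch of the mismatch, assuming the prediction is a linear scaling of the utilization against the reference (boosted) frequency and ignoring both the utilization headroom and the rounding to an actual performance state. The function names and the numbers in the note after the code are hypothetical.

```c
#include <stdint.h>

/* Frequency (kHz) predicted for a utilization, relative to ref_freq. */
static uint64_t predict_freq(uint64_t util, uint64_t max_capacity,
			     uint64_t ref_freq)
{
	return ref_freq * util / max_capacity;
}

/* Clamp a frequency request to the policy limits (kHz). */
static uint64_t clamp_freq(uint64_t freq, uint64_t policy_min,
			   uint64_t policy_max)
{
	if (freq < policy_min)
		return policy_min;
	if (freq > policy_max)
		return policy_max;
	return freq;
}
```

For example, with a boosted reference frequency of 3000000 kHz, a policy maximum of 2400000 kHz (boost disabled) and a utilization of 900 out of 1024, the prediction is above the policy maximum, while the request that actually reaches the driver is clamped to 2400000 kHz.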
Regards,
Pierre
[1]: [PATCH v2 4/6] cpufreq/schedutil: use a fixed reference frequency
[2]: https://lore.kernel.org/lkml/20230905113308.GF28319@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
+}
+
 static void cppc_cpufreq_register_em(struct cpufreq_policy *policy)
 {
 	struct cppc_cpudata *cpu_data;
@@ -643,6 +658,9 @@ static void cppc_cpufreq_register_em(struct cpufreq_policy *policy)
 		EM_ADV_DATA_CB(cppc_get_cpu_power, cppc_get_cpu_cost);
 	cpu_data = policy->driver_data;
+
+	cppc_cpufreq_set_capacity_ref_freq(policy);
+
 	em_dev_register_perf_domain(get_cpu_device(policy->cpu),
 			get_perf_level_count(policy), &em_cb,
 			cpu_data->shared_cpu_map, 0);