Re: [PATCH 4/4] energy_model: use a fixed reference frequency

From: Vincent Guittot
Date: Tue Sep 05 2023 - 12:19:38 EST


On Tue, 5 Sept 2023 at 12:05, Pierre Gondois <pierre.gondois@xxxxxxx> wrote:
>
> Hello Vincent,
> I tried the patch set on a platform that uses cppc_cpufreq and has boosting
> frequencies:
>
> 1-
> On such a platform, the CPU capacity comes from the CPPC highest_frequency
> field. The CPU capacity is set to the capacity of the boosting frequency.
> This behaviour differs from DT platforms, where the CPU capacity is
> updated whenever the boosting mode is enabled (it seems).

OK, I hadn't noticed that cppc_cpufreq would be impacted by this
change in arch_topology. I'm going to check how to fix that.

>
> Wouldn't it be better to have CPU max capacities set to their boosting
> capacity, as for CPPC-based platforms? It seems the max frequency is always
> available somehow for all the cpufreq drivers with boosting available, i.e.
> acpi-cpufreq, amd-pstate, cppc_cpufreq.

Some platforms never enable boost, or boost is only temporarily
available before being capped. As a result, some prefer to use a more
sustainable frequency for their max capacity. That's why we can't
always use the max/boost frequency.
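
As a rough illustration of why the choice of reference matters, here is a
minimal userspace sketch. The frequencies are hypothetical (not taken from
any real platform) and util_to_freq()/freq_mismatch() are made-up names;
only the freq * util / cap arithmetic is meant to mirror map_util_freq().
It assumes the capacity was computed against a 2.0 GHz sustained frequency
while the last EM entry is a 2.5 GHz boost point.

```c
/* Same arithmetic as the kernel's map_util_freq(): freq * util / cap. */
static unsigned long util_to_freq(unsigned long util, unsigned long ref_freq,
				  unsigned long cap)
{
	return ref_freq * util / cap;
}

/* Hypothetical platform, frequencies in kHz (assumed values). */
#define SUSTAINED_KHZ	2000000UL	/* frequency the capacity is based on */
#define BOOST_KHZ	2500000UL	/* last entry of the EM table */

/*
 * Difference between the frequency predicted from the boost entry and the
 * one predicted from the sustained reference, for the same utilization.
 */
static unsigned long freq_mismatch(unsigned long util, unsigned long cap)
{
	return util_to_freq(util, BOOST_KHZ, cap) -
	       util_to_freq(util, SUSTAINED_KHZ, cap);
}
```

With util = 512 and cap = 1024, the sustained reference yields 1000000 kHz
while the boost entry yields 1250000 kHz, so the two users of the same
utilization would disagree by 250000 kHz.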

>
>
> 2-
> On the CPPC based platforms, the per_cpu freq_factor is not used/updated,
> meaning that we have:
> arch_scale_freq_ref_em()
> \-arch_scale_freq_ref()
>   \-topology_get_freq_ref()
>     \-per_cpu(freq_factor, cpu) (set to the default value: 1)
> and em_cpu_energy()'s ref_freq variable is then set to 1 instead of the max
> frequency (leading to a 0 energy computation).

IIUC, cppc uses the default CPU capacity of arch_topology and never
updates it, yet it creates an EM for this SMP system.
OK, so you have an EM set up with ACPI and SMP.

I'm going to check where we could set this reference frequency for your case.
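
To make the effect of the default value concrete, here is a minimal
userspace sketch of the frequency computation in em_cpu_energy(). The two
map_util_*() bodies are written as I understand the kernel helpers at the
time of this series (25% headroom, then freq * util / cap); target_freq()
is a hypothetical wrapper and the input values are assumptions.

```c
/* Adds 25% headroom to the utilization, like the kernel helper. */
static unsigned long map_util_perf(unsigned long util)
{
	return util + (util >> 2);
}

/* Maps a utilization to a frequency relative to a reference frequency. */
static unsigned long map_util_freq(unsigned long util, unsigned long freq,
				   unsigned long cap)
{
	return freq * util / cap;
}

/*
 * Hypothetical wrapper reproducing the steps of em_cpu_energy() that
 * lead to 'freq': headroom, clamp to the allowed capacity, then scale
 * by the reference frequency.
 */
static unsigned long target_freq(unsigned long max_util,
				 unsigned long allowed_cpu_cap,
				 unsigned long ref_freq,
				 unsigned long scale_cpu)
{
	max_util = map_util_perf(max_util);
	if (max_util > allowed_cpu_cap)
		max_util = allowed_cpu_cap;

	return map_util_freq(max_util, ref_freq, scale_cpu);
}
```

With ref_freq = 1 and scale_cpu = 1024, target_freq(400, 1024, 1, 1024)
is 0, and the result only reaches 1 once the utilization is clamped to the
full capacity. With an assumed ref_freq of 2000000 kHz the same call gives
976562 kHz. So with the default freq_factor the requested frequency no
longer depends on the utilization in any useful way.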

>
> 3-
> Also, just in case: arch_scale_freq_ref_policy() and cpufreq_get_hw_max_freq()
> seem to have a similar (but not identical) purpose.
>
> Regards,
> Pierre
>
> On 9/1/23 15:03, Vincent Guittot wrote:
> > The last item of a performance domain is not always the performance point
> > that has been used to compute the CPU's capacity. This can lead to a
> > different target frequency compared with other parts of the system, like
> > schedutil, and would result in a wrong energy estimation.
> >
> > A new arch_scale_freq_ref() is available to return a fixed and coherent
> > frequency reference that can be used when computing the CPU's frequency
> > for a level of utilization. Use this function when available, or fall back
> > to the last performance domain item otherwise.
> >
> > Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> > ---
> > include/linux/energy_model.h | 20 +++++++++++++++++---
> > 1 file changed, 17 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h
> > index b9caa01dfac4..7ee07be6928e 100644
> > --- a/include/linux/energy_model.h
> > +++ b/include/linux/energy_model.h
> > @@ -204,6 +204,20 @@ struct em_perf_state *em_pd_get_efficient_state(struct em_perf_domain *pd,
> >  	return ps;
> >  }
> >
> > +#ifdef arch_scale_freq_ref
> > +static __always_inline
> > +unsigned long arch_scale_freq_ref_em(int cpu, struct em_perf_domain *pd)
> > +{
> > +	return arch_scale_freq_ref(cpu);
> > +}
> > +#else
> > +static __always_inline
> > +unsigned long arch_scale_freq_ref_em(int cpu, struct em_perf_domain *pd)
> > +{
> > +	return pd->table[pd->nr_perf_states - 1].frequency;
> > +}
> > +#endif
> > +
> > /**
> > * em_cpu_energy() - Estimates the energy consumed by the CPUs of a
> > * performance domain
> > @@ -224,7 +238,7 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd,
> >  				unsigned long max_util, unsigned long sum_util,
> >  				unsigned long allowed_cpu_cap)
> >  {
> > -	unsigned long freq, scale_cpu;
> > +	unsigned long freq, ref_freq, scale_cpu;
> >  	struct em_perf_state *ps;
> >  	int cpu;
> >
> > @@ -241,11 +255,11 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd,
> >  	 */
> >  	cpu = cpumask_first(to_cpumask(pd->cpus));
> >  	scale_cpu = arch_scale_cpu_capacity(cpu);
> > -	ps = &pd->table[pd->nr_perf_states - 1];
> > +	ref_freq = arch_scale_freq_ref_em(cpu, pd);
> >
> >  	max_util = map_util_perf(max_util);
> >  	max_util = min(max_util, allowed_cpu_cap);
> > -	freq = map_util_freq(max_util, ps->frequency, scale_cpu);
> > +	freq = map_util_freq(max_util, ref_freq, scale_cpu);
> >
> >  	/*
> >  	 * Find the lowest performance state of the Energy Model above the