Re: [PATCH] PM / EM: Inefficient OPPs detection

From: Vincent Donnefort
Date: Thu Apr 15 2021 - 10:35:01 EST


On Thu, Apr 15, 2021 at 01:16:35PM +0000, Quentin Perret wrote:
> On Thursday 08 Apr 2021 at 18:10:29 (+0100), Vincent Donnefort wrote:
> > --- a/kernel/sched/cpufreq_schedutil.c
> > +++ b/kernel/sched/cpufreq_schedutil.c
> > @@ -10,6 +10,7 @@
> >
> > #include "sched.h"
> >
> > +#include <linux/energy_model.h>
> > #include <linux/sched/cpufreq.h>
> > #include <trace/events/power.h>
> >
> > @@ -164,6 +165,9 @@ static unsigned int get_next_freq(struct sugov_policy *sg_policy,
> >
> > freq = map_util_freq(util, freq, max);
> >
> > + /* Avoid inefficient performance states */
> > + freq = em_pd_get_efficient_freq(em_cpu_get(policy->cpu), freq);
>
> I remember this was discussed when Douglas sent his patches some time
> ago, but I still find it sad we index the EM table here but still
> re-index the cpufreq frequency table later :/
>
> Yes in your case this lookup is very inexpensive, but still. EAS relies
> on the EM's table matching cpufreq's accurately, so this second lookup
> still feels rather unnecessary ...

To get only a single lookup, we would need to bring the inefficiency knowledge
directly to the cpufreq framework. But it has its own limitations:

The cpufreq driver can have its own resolve_freq() callback, which means that
not all the drivers would benefit from that feature.

The cpufreq_table can be ordered and accessed in several ways which brings
many combinations that would need to be supported, ending-up with something
much more intrusive. (We can though decide to limit the feature to the low to
high access that schedutil needs).

As the EM needs schedutil to exist anyway, it seemed to be the right place for
this code. It allows any cpufreq driver to benefit from the feature, simplify a
potential extension for a usage by devfreq devices and as a bonus it speeds-up
energy computing, allowing a more complex Energy Model.

>
> > if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update)
> > return sg_policy->next_freq;
> >
> > --
> > 2.7.4
> >