Re: [PATCH v2 0/3] EM / PM: Inefficient OPPs

From: Viresh Kumar
Date: Wed May 26 2021 - 05:38:14 EST

On 26-05-21, 10:01, Vincent Donnefort wrote:
> I originally considered adding the inefficiency knowledge to the CPUFreq table.

I wasn't talking about the cpufreq table here in the beginning, but about calling
dev_pm_opp_disable(), which will eventually be reflected in the cpufreq table as well.

> But I then gave up the idea for two reasons:
> * The EM depends on having schedutil enabled. I don't think that any
> other governor would then manage to rely on the inefficient OPPs. (also I
> believe Peter had a plan to keep schedutil as the one and only governor)

Right, the EM is only there for schedutil.

I would encourage doing this without the EM dependency, if possible. It would
be a good thing to have generally, for any driver that wants it.

> * The CPUfreq driver doesn't have to rely on the CPUfreq table. If the
> knowledge about inefficient OPPs is in the latter, some drivers might not
> be able to rely on the feature (you might say 'their loss' though :))
> For those reasons, I thought that adding inefficient support into the
> CPUfreq table would complicate the patchset a lot for no functional gain.

What about disabling the OPP in the OPP core itself? Then every user will get
the same picture.

> >
> > Since the whole thing depends on EM and OPPs, I think we can actually do this.
> >
> > When the cpufreq driver registers with the EM core, let's find all the
> > inefficient OPPs and disable them once and for all. Of course, this must be done
> > on a voluntary basis; a flag from the drivers will do. With this, we won't be
> > required to update anything at the governors' end.
> We still need to keep the inefficient OPPs for thermal reason.

How will that benefit us if that OPP is never going to run anyway? We won't be
cooling down the CPU then, will we?

> But if we go with
> the inefficiency support into the CPUfreq table, we could enable or disable
> them, depending on the thermal pressure. Or add a flag to read the table with or
> without inefficient OPPs?

Yeah, I was looking for a cpufreq driver flag or something like that so OPPs
don't disappear magically for some platforms which don't want it to happen.

Moreover, a cpufreq driver first creates the OPP table, then registers with EM
or thermal. If we can play with that sequence a bit and make sure inefficient
OPPs are disabled before thermal or cpufreq tables are created, we will be good.
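
To make that ordering concrete, here is a rough user-space model, not kernel
code. Everything in it is hypothetical: the struct model_opp type, the
model_* helpers and the driver_opted_in flag are made up for illustration and
do not exist in the OPP, EM or cpufreq cores. It also assumes an OPP counts as
inefficient when some higher-frequency OPP has an equal or lower energy cost
per unit of work (power / frequency), which is my reading of the series and
not something settled in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one OPP; NOT the kernel's struct dev_pm_opp. */
struct model_opp {
	unsigned long freq_khz;
	unsigned long power_mw;
	bool enabled;
};

/*
 * True if a's cost (power / freq) is >= b's cost. Cross-multiplied to
 * stay in integer arithmetic: pa / fa >= pb / fb  <=>  pa * fb >= pb * fa.
 */
static bool model_cost_at_least(const struct model_opp *a,
				const struct model_opp *b)
{
	return (unsigned long long)a->power_mw * b->freq_khz >=
	       (unsigned long long)b->power_mw * a->freq_khz;
}

/*
 * Step 1: disable inefficient OPPs, but only if the driver asked for it.
 * The table must be sorted by ascending frequency. An OPP is treated as
 * inefficient (assumption) if any higher-frequency OPP costs no more.
 * Returns the number of OPPs disabled.
 */
size_t model_disable_inefficient(struct model_opp *table, size_t n,
				 bool driver_opted_in)
{
	size_t i, j, disabled = 0;

	assert(table != NULL);

	if (!driver_opted_in)
		return 0;

	for (i = 0; i < n; i++) {
		for (j = i + 1; j < n; j++) {
			if (model_cost_at_least(&table[i], &table[j])) {
				table[i].enabled = false;
				disabled++;
				break;
			}
		}
	}

	return disabled;
}

/*
 * Step 2: build the frequency table from whatever is still enabled.
 * Because this runs after step 1, every consumer of the table (cpufreq,
 * thermal) sees the same set of OPPs. Returns the number of entries
 * written to out.
 */
size_t model_build_freq_table(const struct model_opp *table, size_t n,
			      unsigned long *out, size_t out_len)
{
	size_t i, count = 0;

	for (i = 0; i < n && count < out_len; i++) {
		if (table[i].enabled)
			out[count++] = table[i].freq_khz;
	}

	return count;
}
```

With made-up OPPs of (500000 kHz, 100 mW), (800000 kHz, 200 mW),
(1000000 kHz, 220 mW) and (1200000 kHz, 400 mW), the 800000 kHz entry costs
more per unit of work than the 1000000 kHz one, so step 1 disables it and
step 2 produces a three-entry table. With driver_opted_in false, nothing
disappears.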