Re: [PATCH 0/8] cpufreq: Auto-register with energy model

From: Viresh Kumar
Date: Wed Aug 11 2021 - 01:19:32 EST


On 10-08-21, 13:35, Quentin Perret wrote:
> On Tuesday 10 Aug 2021 at 13:06:47 (+0530), Viresh Kumar wrote:
> > Provide a cpufreq driver flag so drivers can ask the cpufreq core to register
> > with the EM core on their behalf.
>
> Hmm, that's not quite what this does. This asks the cpufreq core to
> use *PM_OPP* to register an EM, which I think is kinda wrong to do from
> there IMO. The decision to use PM_OPP or another mechanism to register
> an EM belongs to platform specific code (drivers), so it is odd for the
> PM_OPP registration to have its own cpufreq flag but not the other ways.
>
> As mentioned in another thread, the very reason to have PM_EM is to not
> depend on PM_OPP, so I'm worried about the direction of travel with this
> series TBH.

I had to use the pm-opp version, since almost everyone was using that.

On the other hand, there isn't a lot of OPP specific stuff in
dev_pm_opp_of_register_em(). It just uses dev_pm_opp_get_opp_count(),
that's all. This ended up in the OPP core, nothing else. Maybe we can
now move it back to the EM core and name it differently ?

> > This allows us to get rid of duplicated code
> > in the drivers and fix the unregistration part as well, which none of the
> > drivers have done until now.
>
> This series adds more code than it removes,

Sadly yes :(

> and the unregistration is
> not a fix as we don't ever remove the EM tables by design, so not sure
> either of these points are valid arguments.

I think that design needs to be looked over again, it looks broken to
me everytime I land onto this code. I wonder why we don't unregister
stuff.

Lets say, I am working on the cpufreq driver and I want to test that
on my ARM machine. Rebooting a simpler board to test stuff out is
easy, but if I am working on an ARM server which is running lots of
other userspace stuff as well, I won't want to reboot the machine just
to test a different versions of the driver. I will rather want to
build the driver as module and insert/remove it again and again.

If the frequency table changes in between versions, this just breaks
as EM won't be updated again.

This breaks one of the most basic rules of Linux Kernel. Inserting a
module should have exactly the same final behavior every single time.
This model doesn't guarantee it. It simply looks broken.

> > This would also make the registration with EM core to happen only after policy
> > is fully initialized, and the EM core can do other stuff from in there, like
> > marking frequencies as inefficient (WIP). Though this patchset is useful without
> > that work being done and should be merged nevertheless.
> >
> > This doesn't update scmi cpufreq driver for now as it is a special case and need
> > to be handled differently. Though we can make it work with this if required.
>
> Note that we'll have more 'special cases' if other architectures start
> using PM_EM, which is what we have been trying to allow since the
> beginning, so that's worth keeping in mind.

Yes, we need to take care of all such special cases as well.

--
viresh