Re: [PATCH 3/3] PM: domains: Add a ->dev_get_performance_state() callback to genpd

From: Ulf Hansson
Date: Tue Sep 07 2021 - 05:58:45 EST


On Mon, 6 Sept 2021 at 16:35, Dmitry Osipenko <digetx@xxxxxxxxx> wrote:
>
> 06.09.2021 13:53, Ulf Hansson wrote:
> > On Sun, 5 Sept 2021 at 11:11, Dmitry Osipenko <digetx@xxxxxxxxx> wrote:
> >>
> >> 03.09.2021 17:09, Ulf Hansson wrote:
> >>> On Fri, 3 Sept 2021 at 12:06, Dmitry Osipenko <digetx@xxxxxxxxx> wrote:
> >>>>
> >>>> 03.09.2021 11:55, Ulf Hansson wrote:
> >>>>> On Fri, 3 Sept 2021 at 08:00, Dmitry Osipenko <digetx@xxxxxxxxx> wrote:
> >>>>>>
> >>>>>> 02.09.2021 13:16, Ulf Hansson wrote:
> >>>>>>> Hardware may be preprogrammed to a specific performance state, which may
> >>>>>>> not be zero initially during boot. This may cause genpd's current
> >>>>>>> performance state to become inconsistent with the state of the hardware. To
> >>>>>>> deal with this, the driver for a device that is being attached to its
> >>>>>>> genpd needs to request an initial performance state vote, which is
> >>>>>>> typically done by calling some of the OPP APIs while probing.
> >>>>>>>
> >>>>>>> In some cases this would lead to boilerplate code in the drivers. Let's
> >>>>>>> make it possible to avoid this, by adding a new optional callback to genpd
> >>>>>>> and invoke it per device during the attach process. In this way, the genpd
> >>>>>>> provider driver can inform genpd about the initial performance state that
> >>>>>>> is needed for the device.
> >>>>>>>
> >>>>>>> Signed-off-by: Ulf Hansson <ulf.hansson@xxxxxxxxxx>
> >>>>>>> ---
> >>>>>>> drivers/base/power/domain.c | 8 +++++---
> >>>>>>> include/linux/pm_domain.h | 2 ++
> >>>>>>> 2 files changed, 7 insertions(+), 3 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> >>>>>>> index 800adf831cae..1a6f3538af8d 100644
> >>>>>>> --- a/drivers/base/power/domain.c
> >>>>>>> +++ b/drivers/base/power/domain.c
> >>>>>>> @@ -2640,13 +2640,15 @@ static void genpd_dev_pm_sync(struct device *dev)
> >>>>>>> genpd_queue_power_off_work(pd);
> >>>>>>> }
> >>>>>>>
> >>>>>>> -static int genpd_get_default_performance_state(struct device *dev,
> >>>>>>> +static int genpd_get_default_performance_state(struct generic_pm_domain *genpd,
> >>>>>>> + struct device *dev,
> >>>>>>> unsigned int index)
> >>>>>>> {
> >>>>>>> int pstate = of_get_required_opp_performance_state(dev->of_node, index);
> >>>>>>>
> >>>>>>> if (pstate == -ENODEV || pstate == -EOPNOTSUPP)
> >>>>>>> - return 0;
> >>>>>>> + pstate = genpd->dev_get_performance_state ?
> >>>>>>> + genpd->dev_get_performance_state(genpd, dev) : 0;
> >>>>>>>
> >>>>>>> return pstate;
> >>>>>>> }
> >>>>>>> @@ -2701,7 +2703,7 @@ static int __genpd_dev_pm_attach(struct device *dev, struct device *base_dev,
> >>>>>>> }
> >>>>>>>
> >>>>>>> /* Set the default performance state */
> >>>>>>> - pstate = genpd_get_default_performance_state(dev, index);
> >>>>>>> + pstate = genpd_get_default_performance_state(pd, dev, index);
> >>>>>>
> >>>>>> If base device is suspended, then its performance state is zero.
> >>>>>>
> >>>>>> When device will be rpm-resumed, then its performance should be set to
> >>>>>> the default state.
> >>>>>> You're setting performance state of the wrong device, it should be the
> >>>> Are you okay with my variant of handling the suspended device?
> >>>
> >>> Not sure if you intended to post this line?
> >>>
> >>> In any case, I am happy to help and review to move things forward.
> >>
> >> It's not clear to me whether you omitted handling the case of
> >> rpm-suspended device on purpose or not. I think it should be a part of
> >> this patch, but sounds like you want to work on it separately, correct?
> >
> > I didn't omit the handling, but instead relied solely on the
> > pm_runtime_suspended() check in dev_pm_genpd_set_performance_state().
>
> It doesn't work as expected for Tegra because pm_runtime_suspended()
> returns false while RPM is disabled and it's normally disabled at the
> attachment time.

Runtime PM is in most cases (probably all) not enabled for the device
when attaching.

This isn't specific to Tegra, but a common behavior of how it works
during attach.
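
To illustrate the point, here is a small userspace model of the two
helpers being compared in this thread. It is only a sketch: the
model_* names and the reduced struct are made up for the example, and
it assumes the usual state of a device right after initialization
(status RPM_SUSPENDED, disable_depth of 1). The two checks mirror my
understanding of pm_runtime_suspended() and
pm_runtime_status_suspended().

```c
#include <stdbool.h>

/* Reduced model: only the two fields the helpers look at. */
enum rpm_status { RPM_ACTIVE, RPM_RESUMING, RPM_SUSPENDED, RPM_SUSPENDING };

struct model_dev {
	enum rpm_status runtime_status;
	unsigned int disable_depth;	/* > 0 means runtime PM is disabled */
};

/* Models pm_runtime_status_suspended(): looks at the status only. */
static bool model_pm_runtime_status_suspended(const struct model_dev *dev)
{
	return dev->runtime_status == RPM_SUSPENDED;
}

/* Models pm_runtime_suspended(): also requires runtime PM to be enabled. */
static bool model_pm_runtime_suspended(const struct model_dev *dev)
{
	return dev->runtime_status == RPM_SUSPENDED && !dev->disable_depth;
}

/* Assumed device state at attach time: suspended status, runtime PM disabled. */
static struct model_dev model_dev_at_attach(void)
{
	return (struct model_dev){
		.runtime_status = RPM_SUSPENDED,
		.disable_depth = 1,
	};
}
```

With that state, the status-only check reports "suspended" while the
pm_runtime_suspended() style check does not, for any device and not
just on Tegra.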

>
> >>>>>> base device and not the virtual domain device.
> >>>>>
> >>>>> No I am not. :-) Let me elaborate.
> >>>>>
> >>>>> For the single PM domain case, 'dev' and 'base_dev' are pointing to
> >>>>> the same device. So this works fine.
> >>>>>
> >>>>> For the multiple PM domain case or when attaching goes via
> >>>>> genpd_dev_pm_attach_by_id(), 'dev' is the virtual device registered in
> >>>>> genpd_dev_pm_attach_by_id(). In this case, it's 'dev' that is becoming
> >>>>> attached to genpd and not the 'base_dev'. Note also that, runtime PM
> >>>>> has not been enabled for 'dev' yet at this point and 'dev' has been
> >>>>> assigned the same OF node as 'base_dev', to allow OF parsing to work
> >>>>> as is for it.
> >>>>>
> >>>>> Moreover, to deal with runtime PM in the multiple PM domain case, the
> >>>>> consumer driver should create a device link. Along the lines of this:
> >>>>> device_link_add(base_dev, dev, DL_FLAG_PM_RUNTIME |
> >>>>> DL_FLAG_STATELESS), thus assigning the virtual device ('dev') as the
> >>>>> supplier for its consumer device ('base_dev').
> >>>>>
> >>>>>>
> >>>>>> This is all handled properly by my patch [1]. It's complicated
> >>>>>> for a reason.
> >>>>>
> >>>>> See above. It shouldn't have to be complicated. If it still is, there
> >>>>> is something to fix for the multiple PM domain case.
> >>>>>> [1]
> >>>> Alright, it actually works now on Tegra using the dev in the callback
> >>>> for the case of multiple domains, I re-checked it. Previously, when I
> >>>> tried that, there was a conflict in regards to OPP usage, I don't
> >>>> remember details anymore. Maybe the recent changes that were suggested
> >>>> by Viresh helped with that. So yes, there is no need to pass the base
> >>>> device anymore.
> >>>
> >>> Great! So, it seems like $subject patch should be a way forward for you then?
> >>
> >> The current behaviour is incorrect for Tegra because it needs to set the
> >> rpm_pstate for rpm-suspended device, instead of bumping the state
> >> immediately.
> >>
> >> Power management is defeated without it on Tegra because SoC will start
> >> to consume extra power while device that needs this power is suspended.
> >
> > Okay, I understand your concern.
> >
> > For devices that may remain runtime suspended when their consumer
> > drivers probe them, the behaviour may be suboptimal. This is because it
> > could lead to having an active performance state vote for a runtime
> > suspended device, at least until it gets runtime resumed and then
> > runtime suspended again.
> >
> > This all boils down to how the consumer driver deploys support for
> > runtime PM - and genpd doesn't know nor can control that.
>
> Previously, I added the 'dev_suspended' argument to the
> dev_get_performance_state() callback to allow PD driver to decide
> whether the state should be applied immediately or on rpm-resume, but you asked
> to remove it because it didn't make sense to you [1].
>
> [1]
> https://lore.kernel.org/linux-pm/CAPDyKFo=SFpm+uJYH4UDfKWLVnkP2cKkBcbOQeVhU5hRxHUMCw@mail.gmail
>
> Does it make sense now?

Unfortunately, no, it still doesn't. Let me try to elaborate why below.

>
> > I wonder if we perhaps should just leave this as is then. In other
> > words, rely on the consumer driver to vote for an initial performance
> > state of the device during ->probe(). In this way, the consumer driver
> > can decide what is the best thing to do, rather than letting genpd
> > make guesses.
>
> The point of this series is to remove the boilerplate code from consumer
> drivers.
>
> I already implemented a variant with the explicit state syncing done by
> consumer drivers, but Viresh suggested that it should be done by the PD
> driver, this is why we're discussing it all over again.
>
> We either need to add quirks to consumer drivers or make PD API more
> flexible. You're not in favor of extending the PD API. To me the variant
> with the PD API extension is a bit nicer since it removes the
> boilerplate code, but I also see why you don't like it.

I don't mind extending the genpd API, but it needs to serve a good purpose.

As I said earlier, genpd doesn't know nor can control how the consumer
driver deploys runtime PM. Unfortunately, that also includes genpd
providers, as the behavior isn't a platform or PM domain specific
thing. This means genpd needs to be generic enough so it works for all
cases.

In the $subject patch, we rely on the pm_runtime_suspended() check in
dev_pm_genpd_set_performance_state(), which should work for all cases,
even if it may be sub-optimal for some scenarios.

Note that, in the approach you suggested [1],
pm_runtime_status_suspended() is used instead. This doesn't work when
a consumer driver doesn't enable runtime PM - or calls
pm_runtime_set_active() during ->probe(), because
genpd_runtime_resume() won't be invoked to restore the gpd->rpm_pstate.
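
A rough userspace sketch of that failure mode, to make it concrete.
Everything here is a simplified model and not the genpd code: the
model_* names, the struct and the 'status_only' switch are made up for
the example. It assumes a vote is either applied right away or stashed
in rpm_pstate to be restored by the runtime resume callback, and that
a driver calling pm_runtime_set_active() never triggers that callback.

```c
#include <stdbool.h>

enum rpm_status { RPM_ACTIVE, RPM_SUSPENDED };

struct model_dev {
	enum rpm_status runtime_status;
	unsigned int disable_depth;	/* > 0 means runtime PM is disabled */
	unsigned int applied_pstate;	/* vote that is currently in effect */
	unsigned int rpm_pstate;	/* vote stashed until runtime resume */
};

/*
 * Models the vote at attach time. With status_only set, the check
 * behaves like pm_runtime_status_suspended(); otherwise it behaves
 * like pm_runtime_suspended() and also requires runtime PM to be enabled.
 */
static void model_attach(struct model_dev *dev, unsigned int pstate,
			 bool status_only)
{
	bool suspended = dev->runtime_status == RPM_SUSPENDED &&
			 (status_only || !dev->disable_depth);

	if (suspended)
		dev->rpm_pstate = pstate;	/* defer until runtime resume */
	else
		dev->applied_pstate = pstate;	/* apply immediately */
}

/*
 * Models a ->probe() that marks the device active and enables runtime
 * PM. No runtime resume callback runs, so rpm_pstate is not restored.
 */
static void model_probe_set_active(struct model_dev *dev)
{
	dev->runtime_status = RPM_ACTIVE;	/* pm_runtime_set_active() */
	dev->disable_depth = 0;			/* pm_runtime_enable() */
}

/* Returns the vote in effect after attach followed by probe. */
static unsigned int model_vote_after_probe(unsigned int pstate,
					   bool status_only)
{
	struct model_dev dev = {
		.runtime_status = RPM_SUSPENDED,
		.disable_depth = 1,
	};

	model_attach(&dev, pstate, status_only);
	model_probe_set_active(&dev);
	return dev.applied_pstate;
}
```

In this model the status-only variant leaves an active device without
its vote, while the pm_runtime_suspended() variant applies the vote at
attach, which may be early but is never lost.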

That said, I wouldn't mind simply skipping the addition of the
->dev_get_performance_state() callback altogether, if that is what you
prefer? In this way, it becomes the responsibility of the consumer
driver to do the right thing, at the cost of some boilerplate code added
in its ->probe() routine.

>
> Viresh, are you okay with going back to the variant with the
> dev_pm_opp_sync() helper?

Rather than trying this again, I would suggest you start by open
coding these parts, for now. But I leave that to Viresh to decide.

[...]

Kind regards
Uffe

[1]
[PATCH v10 4/8] PM: domains: Add dev_get_performance_state() callback