Re: [PATCH v9 4/8] PM: domains: Add get_performance_state() callback

From: Ulf Hansson
Date: Mon Aug 30 2021 - 05:19:59 EST


+ Dmitry Baryshkov, Bjorn Andersson

On Fri, 27 Aug 2021 at 17:50, Dmitry Osipenko <digetx@xxxxxxxxx> wrote:
>
> 27.08.2021 17:23, Ulf Hansson пишет:
> > On Fri, 27 Aug 2021 at 03:37, Dmitry Osipenko <digetx@xxxxxxxxx> wrote:
> >>
> >> Add get_performance_state() callback that retrieves and initializes
> >> performance state of a device attached to a power domain. This removes
> >> inconsistency of the performance state with hardware state.
> >
> > Can you please try to elaborate a bit more on the use case. Users need
> > to know when it makes sense to implement the callback - and so far we
> > tend to document this through detailed commit messages.
> >
> > Moreover, please state that implementing the callback is optional.
>
> Noted
>
> >> Signed-off-by: Dmitry Osipenko <digetx@xxxxxxxxx>
> >> ---
> >> drivers/base/power/domain.c | 32 +++++++++++++++++++++++++++++---
> >> include/linux/pm_domain.h | 2 ++
> >> 2 files changed, 31 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> >> index 3a13a942d012..8b828dcdf7f8 100644
> >> --- a/drivers/base/power/domain.c
> >> +++ b/drivers/base/power/domain.c
> >> @@ -2700,15 +2700,41 @@ static int __genpd_dev_pm_attach(struct device *dev, struct device *base_dev,
> >> goto err;
> >> } else if (pstate > 0) {
> >> ret = dev_pm_genpd_set_performance_state(dev, pstate);
> >> - if (ret)
> >> + if (ret) {
> >> + dev_err(dev, "failed to set required performance state for power-domain %s: %d\n",
> >> + pd->name, ret);
> >
> > Moving the dev_err() here, leads to that we won't print an error if
> > of_get_required_opp_performance_state() fails, a few lines above, is
> > that intentional?
>
> Not intentional, I'll add another message.
>
> >> goto err;
> >> + }
> >> dev_gpd_data(dev)->default_pstate = pstate;
> >> }
> >> +
> >> + if (pd->get_performance_state && !dev_gpd_data(dev)->default_pstate) {
> >> + bool dev_suspended = false;
> >> +
> >> + ret = pd->get_performance_state(pd, base_dev, &dev_suspended);
> >> + if (ret < 0) {
> >> + dev_err(dev, "failed to get performance state for power-domain %s: %d\n",
> >> + pd->name, ret);
> >> + goto err;
> >> + }
> >> +
> >> + pstate = ret;
> >> +
> >> + if (dev_suspended) {
> >
> > The dev_suspended thing looks weird.
> >
> > Perhaps it was needed before dev_pm_genpd_set_performance_state()
> > didn't check pm_runtime_disabled()?
>
> There are two possible variants here:
>
> 1. Device is suspended
> 2. Device is active
>
> If device is suspended, then it will be activated on RPM-resume and h/w
> state will require a specific performance state when resumed. Hence only
> the the rpm_pstate should be set, otherwise SoC may start to consume
> extra power if device won't be resumed by a consumer driver and
> performance state is bumped without a real need.
>
> If device is known to be active, then the performance state should be
> updated immediately, otherwise we have inconsistent state with hardware.
>
> For Tegra dev_suspended=true because in general it should be safe to
> assume that hardware is suspended since it's either stopped by the PD
> driver on initial power_on or it's assumed to be disabled by a consumer
> driver during probe. Technically it's possible to check clock and reset
> state of an attached device from the get_performance_state() to find the
> real state of device, but it's not necessary to do so far.

I follow your reasoning above, but I fail to understand your point, sorry.

Your recent patch ("PM: domains: Improve runtime PM performance state
handling"), made dev_pm_genpd_set_performance_state() to call
pm_runtime_suspended(), to check whether it should assign
dev_gpd_data(dev)->rpm_pstate, which postpones the vote until the
device gets runtime resumed - or call genpd_set_performance_state() to
immediately vote for a new performance state.

That updated behaviour of dev_pm_genpd_set_performance_state should be
sufficient, I think.

In other words, please drop the "dev_suspended" parameter from the
->get_performance_state() callback, as it doesn't make sense to me.

>
> I'll add comment to the code.
>
> >> + dev_gpd_data(dev)->rpm_pstate = pstate;
> >> + } else if (pstate > 0) {
> >> + ret = dev_pm_genpd_set_performance_state(dev, pstate);
> >> + if (ret) {
> >> + dev_err(dev, "failed to set required performance state for power-domain %s: %d\n",
> >> + pd->name, ret);
> >> + goto err;
> >> + }
> >> + }
> >> + }
> >
> > Overall, what we seem to be doing here, is to retrieve a value for an
> > initial/default performance state for a device and then we want to set
> > it to make sure the vote becomes aggregated and finally set for the
> > genpd.
> >
> > With your suggested change, there are now two ways to get the
> > initial/default state. One is through the existing
> > of_get_required_opp_performance_state() and the other is by using a
> > new genpd callback.
> >
> > That said, perhaps we would get a bit cleaner code by moving the "get
> > initial/default performance state" thingy, into a separate function
> > and then call it from here. If this function returns a valid
> > performance state, then we should continue to set the state, by
> > calling dev_pm_genpd_set_performance_state() and update
> > dev_gpd_data(dev)->default_pstate accordingly.
> >
> > Would that work, do you think?
>
> To be honest, I'm now confused by
> of_get_required_opp_performance_state(). It assumes that device is
> active all the time while attached and that device is stopped on detach.
>
> If hardware is always-on, then it should be wrong to drop the
> performance state on detach.
>
> If hardware isn't always-on, then it might be suspended during
> attachment, and thus, only the rpm_pstate should be set. It's also not
> guaranteed that consumer driver will suspend device on unbind, leaving
> it active on detach, thus it should be wrong to drop performance state
> on detach.

I assume the new behaviour in dev_pm_genpd_set_performance_state()
should address most of your concerns above, no?

When it comes to the detaching, the best we can do is to drop the
performance state vote for the device, no matter if it's an always on
HW or not. Simply because after a detach, genpd loses track of the
device, which means it can't account for performance states votes for
it anyway.

>
> Hence I think the default_pstate is a bit out of touch. If this
> attach/detach behaviour is specific to QCOM driver/hardware, then maybe
> of_get_required_opp_performance_state() should be moved out to a
> get_performance_state() of the QCOM PD driver?

That may work, but I hope it's unnecessary.

Overall, the important part is the updated path in
dev_pm_genpd_set_performance_state() where we now call
pm_runtime_suspended(). I am pretty sure this should work fine for
Qcom platforms/drivers too, but let's see if Rajendra, Dmitry or Bjorn
have some concerns about this.

>
> I added Rajendra Nayak to explain.
>
> For now we're bailing out if default_pstate is set because it conflicts
> with get_performance_state().
>
> But we can factor out the code into a separate function anyways to make
> it cleaner a tad.

Yes, please.

[...]

Kind regards
Uffe