Re: [PATCH] PM: runtime: Allow rpm_resume() to succeed when runtime PM is disabled

From: Ulf Hansson
Date: Fri Nov 05 2021 - 12:04:00 EST


On Mon, 1 Nov 2021 at 15:41, Grygorii Strashko <grygorii.strashko@xxxxxx> wrote:
>
>
>
> On 01/11/2021 11:27, Ulf Hansson wrote:
> > On Fri, 29 Oct 2021 at 20:27, Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
> >>
> >> On Fri, Oct 29, 2021 at 12:20 AM Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote:
> >>>
> >>> On Wed, 27 Oct 2021 at 16:33, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
> >>>>
> >>>> On Wed, Oct 27, 2021 at 12:55:43PM +0200, Ulf Hansson wrote:
> >>>>> On Wed, 27 Oct 2021 at 04:02, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
> >>>>>>
> >>>>>> On Wed, Oct 27, 2021 at 12:26:26AM +0200, Ulf Hansson wrote:
> >>>>>>> During system suspend, the PM core sets dev->power.is_suspended for the
> >>>>>>> device that is being suspended. This flag is also being used in
> >>>>>>> rpm_resume(), to allow it to succeed by returning 1, assuming that runtime
> >>>>>>> PM has been disabled and the runtime PM status is RPM_ACTIVE, for the
> >>>>>>> device.
> >>>>>>>
> >>>>>>> To make this behaviour a bit more useful, let's drop the check for the
> >>>>>>> dev->power.is_suspended flag in rpm_resume(), as it doesn't really need to
> >>>>>>> be limited to this anyway.
> >>>>>>>
> >>>>>>> Signed-off-by: Ulf Hansson <ulf.hansson@xxxxxxxxxx>
> >>>>>>> ---
> >>>>>>> drivers/base/power/runtime.c | 4 ++--
> >>>>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
> >>>>>>> index ec94049442b9..fadc278e3a66 100644
> >>>>>>> --- a/drivers/base/power/runtime.c
> >>>>>>> +++ b/drivers/base/power/runtime.c
> >>>>>>> @@ -742,8 +742,8 @@ static int rpm_resume(struct device *dev, int rpmflags)
> >>>>>>>  repeat:
> >>>>>>>  	if (dev->power.runtime_error)
> >>>>>>>  		retval = -EINVAL;
> >>>>>>> -	else if (dev->power.disable_depth == 1 && dev->power.is_suspended
> >>>>>>> -	    && dev->power.runtime_status == RPM_ACTIVE)
> >>>>>>> +	else if (dev->power.disable_depth > 0 &&
> >>>>>>> +		 dev->power.runtime_status == RPM_ACTIVE)
> >>>>>>
> >>>>>> IIRC there was a good reason why the original code checked for
> >>>>>> disable_depth == 1 rather than > 0. But I don't remember exactly what
> >>>>>> the reason was. Maybe it had something to do with the fact that during
> >>>>>> a system sleep __device_suspend_late calls __pm_runtime_disable, and the
> >>>>>> code was checking that there were no other disables in effect.
> >>>>>
> >>>>> The check was introduced in the below commit:
> >>>>>
> >>>>> Commit 6f3c77b040fc
> >>>>> Author: Kevin Hilman <khilman@xxxxxx>
> >>>>> Date: Fri Sep 21 22:47:34 2012 +0000
> >>>>> PM / Runtime: let rpm_resume() succeed if RPM_ACTIVE, even when disabled, v2
> >>>>>
> >>>>> By reading the commit message it's pretty clear to me that the check
> >>>>> was added to cover only one specific use case, during system suspend.
> >>>>>
> >>>>> That is, that a driver may want to call pm_runtime_get_sync() from a
> >>>>> late/noirq callback (when the PM core has disabled runtime PM), to
> >>>>> understand whether the device is still powered on and accessible.
> >>>>>
> >>>>>> This is
> >>>>>> related to the documented behavior of rpm_resume (it's supposed to fail
> >>>>>> with -EACCES if the device is disabled for runtime PM, no matter what
> >>>>>> power state the device is in).
> >>>>>>
> >>>>>> That probably is also the explanation for why dev->power.is_suspended
> >>>>>> gets checked: It's how the code tells whether a system sleep is in
> >>>>>> progress.
> >>>>>
> >>>>> Yes, you are certainly correct about the current behaviour. It's there
> >>>>> for a reason.
> >>>>>
> >>>>> On the other hand, I would be greatly surprised if this change caused
> >>>>> any issues. Of course, I can't make guarantees, but I am, of course,
> >>>>> willing to help fix problems if they happen.
> >>>>>
> >>>>> As a matter of fact, I think the current behaviour looks quite
> >>>>> inconsistent, as it depends on whether the device is being system
> >>>>> suspended.
> >>>>>
> >>>>> Moreover, for syscore devices (dev->power.syscore is set for them),
> >>>>> the PM core doesn't set the "is_suspended" flag. Those can benefit
> >>>>> from a common behaviour.
> >>>>>
> >>>>> Finally, I think the "is_suspended" flag actually needs to be
> >>>>> protected by a lock when set by the PM core, as it's being used in two
> >>>>> separate execution paths. Rather than adding a lock for protection,
> >>>>> though, we can just rely on "disable_depth" in rpm_resume(). That
> >>>>> would be easier and would make the behaviour consistent too.
> >>>>
> >>>> As long as is_suspended isn't _written_ in two separate execution paths,
> >>>> we're probably okay without a lock -- provided the code doesn't mind
> >>>> getting an indefinite result when a read races with a write.
> >>>
> >>> Well, indefinite doesn't sound very good to me for these cases, even
> >>> if it most likely will never happen.
> >>>
> >>>>
> >>>>>> So overall, I suspect this change should not be made. But some other
> >>>>>> improvement (like a nice comment) might be in order.
> >>>>>>
> >>>>>> Alan Stern
> >>>>>
> >>>>> Thanks for reviewing!
> >>>>
> >>>> You're welcome. Whatever you eventually decide to do should be okay
> >>>> with me. I just wanted to make sure that you understood the deeper
> >>>> issue here and had given it some thought. For example, it may turn out
> >>>> that you can resolve matters simply by updating the documentation.
> >>>
> >>> I observed the issue on cpuidle-psci. The devices it operates upon are
> >>> assigned as syscore devices and these are hooked up to a genpd.
> >>>
> >>> A call to pm_runtime_get_sync() can happen even after the PM core has
> >>> disabled runtime PM in the "late" phase, so the error code is returned
> >>> in these real use cases.
> >>>
> >>> Now, as we currently don't check the return value of
> >>> pm_runtime_get_sync() in cpuidle-psci, it's not a big deal. But it
> >>> certainly seems worth fixing in my opinion.
> >>>
> >>> Let's see if Rafael has some thoughts around this.
> >>
> >> Am I thinking correctly that this is mostly about working around the
> >> limitations of pm_runtime_force_suspend()?
> >
> > No, this isn't related at all.
> >
> > The cpuidle-psci driver doesn't have PM callbacks, thus using
> > pm_runtime_force_suspend() would not work here.
> >
>
> I think the reason for (dev->power.disable_depth == 1 && dev->power.is_suspended)
> can be found in [1], along with other related comments:
>
> Rafael J. Wysocki:
> >>>
> I've discussed that with Kevin. The problem is that the runtime PM
> status may be changed at will when runtime PM is disabled by using
> __pm_runtime_set_status(), so the status generally cannot be trusted
> if power.disable_depth > 0.
>
> During system suspend, however, runtime PM is disabled by the core and
> if neither the driver nor the subsystem has disabled it in the meantime,
> the status should actually be valid.

I don't quite understand this comment from the past, but I guess that's
also difficult without having the complete context.

In any case, when anyone updates the runtime PM status of a device
through __pm_runtime_set_status(), protection against concurrent
accesses is provided by the spin lock (dev->power.lock).
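
To illustrate, the update is done roughly along these lines (a simplified
sketch of __pm_runtime_set_status(), leaving out the parent/child
accounting, so not the verbatim kernel code):

	unsigned long flags;
	int error = 0;

	spin_lock_irqsave(&dev->power.lock, flags);

	/* The status may only be changed while runtime PM is disabled. */
	if (!dev->power.runtime_error && !dev->power.disable_depth)
		error = -EAGAIN;
	else
		dev->power.runtime_status = status;

	spin_unlock_irqrestore(&dev->power.lock, flags);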

> <<<
>
> Hence, this is about using runtime PM for CPU PM, and CPU PM is a pretty
> specific case. Wouldn't a manual check of the CPU PM status work for you,
> like !pm_runtime_status_suspended()?
> (If I'm not mistaken, CPU PM is done in atomic context.)

No, that doesn't work. If I want to call pm_runtime_status_suspended()
to check the runtime PM status, I would first need to disable runtime
PM.
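
In other words, to trust the status I would end up with something like
the below (a hypothetical sketch), and since pm_runtime_disable() may
sleep, that's a no-go from the atomic CPU idle path:

	bool powered;

	/* Hypothetical sketch of what a "manual" check would require. */
	pm_runtime_disable(dev);	/* may sleep - unusable in atomic context */
	powered = !pm_runtime_status_suspended(dev);
	pm_runtime_enable(dev);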

>
>
> [1] http://lkml.iu.edu/hypermail/linux/kernel/1209.2/03256.html
>
> --
> Best regards,
> grygorii
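
For completeness, the cpuidle-psci usage I referred to above looks
roughly like this (a simplified sketch of the idle path, not the
verbatim driver code):

	/* Simplified sketch of the cpuidle-psci domain idle path. */
	static int psci_enter_domain_idle_state(struct cpuidle_device *dev,
						struct cpuidle_driver *drv,
						int idx)
	{
		struct psci_cpuidle_data *data = this_cpu_ptr(&psci_cpuidle_data);
		struct device *pd_dev = data->dev; /* syscore device in a genpd */
		int ret;

		/* Allow genpd to power off the CPU's PM domain. */
		pm_runtime_put_sync_suspend(pd_dev);

		ret = psci_cpuidle_cpu_suspend(data->psci_states[idx]);

		/*
		 * Power the PM domain back on. The return value isn't
		 * checked today, but once the PM core has disabled runtime
		 * PM in the "late" phase this returns -EACCES, even though
		 * the device is still RPM_ACTIVE.
		 */
		pm_runtime_get_sync(pd_dev);

		return ret;
	}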

Kind regards
Uffe