Re: [PATCH v1 1/5] PM: sleep: Fix runtime PM issue in dpm_resume()
From: Rafael J. Wysocki
Date: Thu Mar 13 2025 - 06:59:06 EST
On Thu, Mar 13, 2025 at 2:50 AM Saravana Kannan <saravanak@xxxxxxxxxx> wrote:
>
> On Tue, Mar 11, 2025 at 3:47 AM Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
> >
> > On Wed, Dec 4, 2024 at 1:53 PM Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
> > >
> > > Trim CC list.
> > >
> > > On Thu, Nov 14, 2024 at 11:09 PM Saravana Kannan <saravanak@xxxxxxxxxx> wrote:
> > > >
> > > > Some devices might have their is_suspended flag set to false. In these
> > > > cases, dpm_resume() should skip doing anything for those devices.
> > >
> > > Not really. This is particularly untrue for devices with
> > > power.direct_complete set that have power.is_suspended clear.
> > >
> > > > However, runtime PM enable and a few others steps are done before
> > > > checking for this flag. Fix it so that we do things in the right order.
> > >
> > > I don't see the bug this is fixing, but I can see bugs introduced by it.
> >
> > So AFAICS the bug is in the error path when dpm_suspend() fails in
> > which case some devices with direct_complete set may not have been
> > handled by device_suspend(). Since runtime PM has not been disabled
> > for those devices yet, it doesn't make sense to re-enable runtime PM
> > for them (and if they had runtime PM disabled to start with, this will
> > inadvertently enable runtime PM for them).
> >
> > However, two changes are needed to fix this issue:
> > (1) power.is_suspended needs to be set for the devices with
> > direct_complete set in device_suspend().
> > (2) The power.is_suspended check needs to be moved after the
> > power.syscore one in device_resume().
> >
> > The patch below only does (2) which is insufficient and it introduces
> > a functional issue for the direct_complete devices with runtime PM
> > disabled because it will cause runtime PM to remain disabled for them
> > permanently.
> >
> > > I think that you want power.is_suspended to be checked before waiting
> > > for the superiors. Fair enough, since for devices with
> > > power.is_suspended clear, there should be no superiors to wait for, so
> > > the two checks can be done in any order and checking
> > > power.is_suspended first would be less overhead. And that's it
> > > AFAICS.
> > >
> > > > Signed-off-by: Saravana Kannan <saravanak@xxxxxxxxxx>
> > > > ---
> > > > drivers/base/power/main.c | 6 +++---
> > > > 1 file changed, 3 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
> > > > index 4a67e83300e1..86e51b9fefab 100644
> > > > --- a/drivers/base/power/main.c
> > > > +++ b/drivers/base/power/main.c
> > > > @@ -913,6 +913,9 @@ static void device_resume(struct device *dev, pm_message_t state, bool async)
> > > > if (dev->power.syscore)
> > > > goto Complete;
> > > >
> > > > + if (!dev->power.is_suspended)
> > > > + goto Unlock;
> >
> > And this should be "goto Complete" because jumping to Unlock
> > introduces a device locking imbalance.
> >
> > > > +
> > > > if (dev->power.direct_complete) {
> > > > /* Match the pm_runtime_disable() in __device_suspend(). */
> > > > pm_runtime_enable(dev);
> > > > @@ -931,9 +934,6 @@ static void device_resume(struct device *dev, pm_message_t state, bool async)
> > > > */
> > > > dev->power.is_prepared = false;
> > > >
> > > > - if (!dev->power.is_suspended)
> > > > - goto Unlock;
> > > > -
> > > > if (dev->pm_domain) {
> > > > info = "power domain ";
> > > > callback = pm_op(&dev->pm_domain->ops, state);
> > > > --
> >
> > If you want to submit a new version of this patch, please do so by the
> > end of the week or I will send my fix because I want this issue to be
> > addressed in 6.15.
>
> Please do ahead with the fix for this. I'm not too comfortable with
> the direct_complete logic yet.
OK, I will, thank you!