Re: [PATCH 1/2] PM-runtime: Fix __pm_runtime_set_status() race with runtime resume

From: Rafael J. Wysocki
Date: Tue Feb 12 2019 - 14:05:35 EST


On Tue, Feb 12, 2019 at 7:28 PM Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote:
>
> On Tue, 12 Feb 2019 at 17:28, Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
> >
> > On Tue, Feb 12, 2019 at 5:18 PM Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote:
> > >
> > > On Tue, 12 Feb 2019 at 13:10, Rafael J. Wysocki <rjw@xxxxxxxxxxxxx> wrote:
> > > >
> > > > From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> > > >
> > > > Commit 4080ab083000 ("PM-runtime: Take suppliers into account in
> > > > __pm_runtime_set_status()") introduced a race condition that may
> > > > trigger if __pm_runtime_set_status() is used incorrectly (that is,
> > > > if it is called when PM-runtime is enabled for the target device
> > > > and working).
> > > >
> > > > In that case, if the original PM-runtime status of the device is
> > > > RPM_SUSPENDED, a runtime resume of the device may occur after
> > > > __pm_runtime_set_status() has dropped its power.lock spinlock
> > > > and before deactivating its suppliers, so the suppliers may be
> > > > deactivated while the device is PM-runtime-active which may lead
> > > > to functional issues.
> > > >
> > > > To avoid that, modify __pm_runtime_set_status() to check whether
> > > > or not PM-runtime is enabled for the device before activating its
> > > > suppliers (if the new status is RPM_ACTIVE) and either return an
> > > > error if that's the case or increment the device's disable_depth
> > > > counter to prevent PM-runtime from being enabled for it while
> > > > the remaining part of the function is running (disable_depth is
> > > > then decremented on the way out).
> > > >
> > > > Fixes: 4080ab083000 ("PM-runtime: Take suppliers into account in __pm_runtime_set_status()")
> > > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> > > > ---
> > > > drivers/base/power/runtime.c | 24 ++++++++++++++++++------
> > > > 1 file changed, 18 insertions(+), 6 deletions(-)
> > > >
> > > > Index: linux-pm/drivers/base/power/runtime.c
> > > > ===================================================================
> > > > --- linux-pm.orig/drivers/base/power/runtime.c
> > > > +++ linux-pm/drivers/base/power/runtime.c
> > > > @@ -1129,6 +1129,22 @@ int __pm_runtime_set_status(struct devic
> > > > if (status != RPM_ACTIVE && status != RPM_SUSPENDED)
> > > > return -EINVAL;
> > > >
> > > > + spin_lock_irq(&dev->power.lock);
> > > > +
> > > > + /*
> > > > + * Prevent PM-runtime from being enabled for the device or return an
> > > > + * error if it is enabled already and working.
> > > > + */
> > > > + if (dev->power.runtime_error || dev->power.disable_depth)
> > > > + dev->power.disable_depth++;
> > > > + else
> > > > + error = -EAGAIN;
> > > > +
> > > > + spin_unlock_irq(&dev->power.lock);
> > > > +
> > > > + if (error)
> > > > + return error;
> > > > +
> > > > /*
> > > > * If the new status is RPM_ACTIVE, the suppliers can be activated
> > > > * upfront regardless of the current status, because next time
> > > > @@ -1147,12 +1163,6 @@ int __pm_runtime_set_status(struct devic
> > > >
> > > > spin_lock_irq(&dev->power.lock);
> > > >
> > > > - if (!dev->power.runtime_error && !dev->power.disable_depth) {
> > > > - status = dev->power.runtime_status;
> > > > - error = -EAGAIN;
> > > > - goto out;
> > > > - }
> > > > -
> > > > if (dev->power.runtime_status == status || !parent)
> > > > goto out_set;
> > > >
> > > > @@ -1205,6 +1215,8 @@ int __pm_runtime_set_status(struct devic
> > > > device_links_read_unlock(idx);
> > > > }
> > > >
> > > > + pm_runtime_enable(dev);
> > >
> > > pm_runtime_enable() uses spin_lock_irqsave(), rather than
> > > spin_lock_irq() - is there a reason to why you want to allow that
> > > here, but not earlier in the function?
> >
> > Device links locking cannot be used in interrupt context.
>
> I get that, but thanks for clarifying.
>
> However, then why did you convert __pm_runtime_set_status() from using
> spin_lock_irqsave() to spin_lock_irq() in commit "PM-runtime: Take
> suppliers into account in __pm_runtime_set_status()"?
>
> As far as I can tell, the spin_lock is not being held while we operate
> on the device links anyway. Or is this just to give the caller an
> indication what kind of context the function can be called from?

The latter plus getting rid of saving/restoring the flags which isn't necessary.