Re: [PATCH 1/2] PM-runtime: Fix __pm_runtime_set_status() race with runtime resume

From: Rafael J. Wysocki
Date: Tue Feb 12 2019 - 11:29:01 EST


On Tue, Feb 12, 2019 at 5:18 PM Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote:
>
> On Tue, 12 Feb 2019 at 13:10, Rafael J. Wysocki <rjw@xxxxxxxxxxxxx> wrote:
> >
> > From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> >
> > Commit 4080ab083000 ("PM-runtime: Take suppliers into account in
> > __pm_runtime_set_status()") introduced a race condition that may
> > trigger if __pm_runtime_set_status() is used incorrectly (that is,
> > if it is called when PM-runtime is enabled for the target device
> > and working).
> >
> > In that case, if the original PM-runtime status of the device is
> > RPM_SUSPENDED, a runtime resume of the device may occur after
> > __pm_runtime_set_status() has dropped its power.lock spinlock
> > and before deactivating its suppliers, so the suppliers may be
> > deactivated while the device is PM-runtime-active which may lead
> > to functional issues.
> >
> > To avoid that, modify __pm_runtime_set_status() to check whether
> > or not PM-runtime is enabled for the device before activating its
> > suppliers (if the new status is RPM_ACTIVE) and either return an
> > error if that's the case or increment the device's disable_depth
> > counter to prevent PM-runtime from being enabled for it while
> > the remaining part of the function is running (disable_depth is
> > then decremented on the way out).
> >
> > Fixes: 4080ab083000 ("PM-runtime: Take suppliers into account in __pm_runtime_set_status()")
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> > ---
> > drivers/base/power/runtime.c | 24 ++++++++++++++++++------
> > 1 file changed, 18 insertions(+), 6 deletions(-)
> >
> > Index: linux-pm/drivers/base/power/runtime.c
> > ===================================================================
> > --- linux-pm.orig/drivers/base/power/runtime.c
> > +++ linux-pm/drivers/base/power/runtime.c
> > @@ -1129,6 +1129,22 @@ int __pm_runtime_set_status(struct devic
> > if (status != RPM_ACTIVE && status != RPM_SUSPENDED)
> > return -EINVAL;
> >
> > + spin_lock_irq(&dev->power.lock);
> > +
> > + /*
> > + * Prevent PM-runtime from being enabled for the device or return an
> > + * error if it is enabled already and working.
> > + */
> > + if (dev->power.runtime_error || dev->power.disable_depth)
> > + dev->power.disable_depth++;
> > + else
> > + error = -EAGAIN;
> > +
> > + spin_unlock_irq(&dev->power.lock);
> > +
> > + if (error)
> > + return error;
> > +
> > /*
> > * If the new status is RPM_ACTIVE, the suppliers can be activated
> > * upfront regardless of the current status, because next time
> > @@ -1147,12 +1163,6 @@ int __pm_runtime_set_status(struct devic
> >
> > spin_lock_irq(&dev->power.lock);
> >
> > - if (!dev->power.runtime_error && !dev->power.disable_depth) {
> > - status = dev->power.runtime_status;
> > - error = -EAGAIN;
> > - goto out;
> > - }
> > -
> > if (dev->power.runtime_status == status || !parent)
> > goto out_set;
> >
> > @@ -1205,6 +1215,8 @@ int __pm_runtime_set_status(struct devic
> > device_links_read_unlock(idx);
> > }
> >
> > + pm_runtime_enable(dev);
>
> pm_runtime_enable() uses spin_lock_irqsave(), rather than
> spin_lock_irq() - is there a reason to why you want to allow that
> here, but not earlier in the function?

Device links locking cannot be used in interrupt context.