Re: [PATCH] PM / sleep: fix use-after-free on async resume

From: Rafael J. Wysocki
Date: Tue Jan 21 2020 - 18:35:15 EST


On Wednesday, January 22, 2020 12:03:16 AM CET Rafael J. Wysocki wrote:
> On Tuesday, January 21, 2020 5:54:58 PM CET Rafael J. Wysocki wrote:
> > On Tue, Jan 21, 2020 at 2:31 AM Chanho Min <chanho.min@xxxxxxx> wrote:
> > >
> > > Some device can be released during suspend (e.g. usb disconnection).
> > > But, Its child device still use dev->parent's lock in dpm_wait().
> > > It can be ocurred use-after-free as bellows. This is happened during
> > > usb resume in practice.
> >
> > In that case the resume of the child is going to be carried out after
> > its parent has gone away, which is generally incorrect..
>
> That isn't really a problem in the case at hand, though, because the memory
> taken up by the parent can only be freed when all of its children have been
> unregistered and all of the class, type, bus, driver etc pointers of the
> children are NULL then, so there won't be a resume callback to execute for
> the child.

Well, not really true, because device_del() doesn't clear dev->bus, for
example, AFAICS, so the resume really needs to be explicitly avoided if
the device has been deleted.

[cut]

> > > --
> >
> > Something a bit more sophisticated is needed here, let me think about that.
> >
>
> I've ended up with the patch below.
>
> The lock prevents the unregistration of dev from completing, if it is acquired
> before device_pm_remove() in device_del(), and that prevents the parent
> reference from being dropped (at the end of the latter) until the lock is held.
> If the lock is acquired after device_pm_remove() has been called for the
> device, there obviously is no need to wait for the parent.
>

So something like this should work:

---
drivers/base/power/main.c | 42 +++++++++++++++++++++++++++++++++++++-----
1 file changed, 37 insertions(+), 5 deletions(-)

Index: linux-pm/drivers/base/power/main.c
===================================================================
--- linux-pm.orig/drivers/base/power/main.c
+++ linux-pm/drivers/base/power/main.c
@@ -273,10 +273,38 @@ static void dpm_wait_for_suppliers(struc
device_links_read_unlock(idx);
}

-static void dpm_wait_for_superior(struct device *dev, bool async)
+static bool dpm_wait_for_superior(struct device *dev, bool async)
{
- dpm_wait(dev->parent, async);
+ struct device *parent;
+
+ /*
+ * If the device and its parent are both resumed asynchronously and the
+ * parent's callback deletes both the device and the parent itself, the
+ * parent object may be freed while this function is running, so avoid
+ * that by reference counting the parent once more unless the device has
+ * been deleted already.
+ */
+ mutex_lock(&dpm_list_mtx);
+
+ if (!device_pm_initialized(dev)) {
+ mutex_unlock(&dpm_list_mtx);
+ return false;
+ }
+
+ parent = get_device(dev->parent);
+
+ mutex_unlock(&dpm_list_mtx);
+
+ dpm_wait(parent, async);
+ put_device(parent);
+
dpm_wait_for_suppliers(dev, async);
+
+ /*
+ * If the parent's callback has deleted the device, it is not correct to
+ * attempt to resume it, so avoid doing that then.
+ */
+ return device_pm_initialized(dev);
}

static void dpm_wait_for_consumers(struct device *dev, bool async)
@@ -621,7 +649,8 @@ static int device_resume_noirq(struct de
if (!dev->power.is_noirq_suspended)
goto Out;

- dpm_wait_for_superior(dev, async);
+ if (!dpm_wait_for_superior(dev, async))
+ goto Out;

skip_resume = dev_pm_may_skip_resume(dev);

@@ -829,7 +858,8 @@ static int device_resume_early(struct de
if (!dev->power.is_late_suspended)
goto Out;

- dpm_wait_for_superior(dev, async);
+ if (!dpm_wait_for_superior(dev, async))
+ goto Out;

callback = dpm_subsys_resume_early_cb(dev, state, &info);

@@ -944,7 +974,9 @@ static int device_resume(struct device *
goto Complete;
}

- dpm_wait_for_superior(dev, async);
+ if (!dpm_wait_for_superior(dev, async))
+ goto Complete;
+
dpm_watchdog_set(&wd, dev);
device_lock(dev);