[PATCH v2] PM: runtime: Return properly from rpm_resume() if dev->power.needs_force_resume flag is set

From: Liu Ying
Date: Fri Sep 23 2022 - 08:48:22 EST


After a device transitions to a sleep state through its system suspend
callback pm_runtime_force_suspend(), the device's driver may still try
to do runtime PM for the device (runtime suspend first and then runtime
resume) although runtime PM is disabled by that callback. These runtime
PM operations have no effect on the device, which is expected to be
resumed through its system resume callback pm_runtime_force_resume().

The problem is that, since the device's runtime PM status is
RPM_SUSPENDED in the sleep state, rpm_resume() does not treat the device
as already active and returns -EACCES instead of 1. On seeing -EACCES,
the device's driver may request a runtime suspend with the RPM_GET_PUT
flag set, which drops the device's usage count to 1. Then, at the
dpm_complete stage, device_complete() calls pm_runtime_put(), which
drops the usage count again to 0 and calls rpm_idle() to try to put the
device into runtime PM suspend status. As a result, the device can end
up staying in runtime PM suspend status.

A real problematic case is the panel-simple.c driver (working with a
downstream DRM device driver), where the optional enable_gpio
(controlled by the runtime PM callbacks) is disabled through
pm_runtime_put() called in device_complete() if a DRM atomic commit
(triggered by a DRM device's system resume callback) tries to runtime
resume the panel before the panel has been made active by force resume:

1) System suspend:
   - pm_runtime_force_suspend()
     - panel_simple_suspend() // enable_gpio is disabled

2) Runtime suspend with a DRM atomic commit:
   - panel_simple_unprepare()
     - pm_runtime_put_autosuspend() // drop device usage count to 1

3) Runtime resume with a DRM atomic commit:
   - panel_simple_prepare()
     - pm_runtime_get_sync() // increase device usage count to 2
       - rpm_resume() // return -EACCES
     - pm_runtime_put_autosuspend() // drop device usage count to 1

4) System resume:
   - pm_runtime_force_resume()
     - panel_simple_resume() // enable_gpio is enabled

5) PM transition complete:
   - dpm_complete()
     - device_complete()
       - pm_runtime_put() // drop device usage count to 0
         - rpm_idle()
           - rpm_suspend() // start hrtimer with expires

6) hrtimer expires:
   - pm_suspend_timer_fn()
     - rpm_suspend() // queue work on pm_wq

7) work function is called:
   - pm_runtime_work()
     - rpm_suspend()
       - panel_simple_suspend() // enable_gpio is disabled
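
The usage count bookkeeping in steps 1) to 5) can be replayed with a
small userspace model. This is only a sketch, not kernel code: struct
model_dev and the model_*() helpers are hypothetical names, and the
starting count of 2 is inferred from the counts quoted in the trace.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the relevant runtime PM state. */
struct model_dev {
	int usage_count;  /* models the runtime PM usage counter */
	bool pm_disabled; /* models power.disable_depth > 0 */
};

/*
 * Models pm_runtime_get_sync() without the fix: the count is bumped
 * first, then the resume attempt fails with -EACCES while runtime PM
 * is disabled.
 */
static int model_get_sync(struct model_dev *d)
{
	d->usage_count++;
	return d->pm_disabled ? -EACCES : 0;
}

/*
 * Models the put helpers: drop the count and report whether it reached
 * zero, which is the point where the kernel would go on to rpm_idle().
 */
static bool model_put(struct model_dev *d)
{
	assert(d->usage_count > 0);
	return --d->usage_count == 0;
}

/* Replays steps 1) to 5); returns true if the device would be idled. */
static bool model_unpatched_sequence(void)
{
	/* Assumption: the count is 2 when system suspend starts. */
	struct model_dev d = { .usage_count = 2, .pm_disabled = false };

	d.pm_disabled = true;        /* 1) pm_runtime_force_suspend() */
	model_put(&d);               /* 2) unprepare: 2 -> 1 */
	if (model_get_sync(&d) < 0)  /* 3) prepare: 1 -> 2, -EACCES */
		model_put(&d);       /*    error path: 2 -> 1 */
	d.pm_disabled = false;       /* 4) pm_runtime_force_resume() */
	return model_put(&d);        /* 5) device_complete(): 1 -> 0 */
}
```

In this model, model_unpatched_sequence() returns true: the final put
in step 5) takes the count to zero, which is what lets rpm_idle() and
the autosuspend timer in steps 6) and 7) disable enable_gpio again.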

Fix the issue by checking the dev->power.needs_force_resume flag in
rpm_resume() so that it returns 1 instead of -EACCES in this scenario,
since the flag is set in pm_runtime_force_suspend(). The device's usage
count will then be 1 after pm_runtime_put() is called at the
dpm_complete stage.
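
The effect of the new check on the early return value can be sketched
in userspace as follows. This is only a model of the order of the
checks in the hunk below, not the kernel function itself: struct
model_power, the MODEL_* constants and the 'patched' switch are
hypothetical, and everything after the early checks is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum model_status { MODEL_ACTIVE, MODEL_SUSPENDED };

/* Hypothetical stand-in for the dev->power fields the checks read. */
struct model_power {
	bool runtime_error;
	bool needs_force_resume;
	int disable_depth;
	enum model_status runtime_status;
	enum model_status last_status;
};

/*
 * Returns a negative errno, 1 for "treat as already active", or 0 when
 * the real function would carry on towards the resume callback (that
 * part is not modeled here).
 */
static int model_rpm_resume_checks(const struct model_power *p, bool patched)
{
	if (p->runtime_error)
		return -EINVAL;
	if (patched && p->needs_force_resume)
		return 1; /* the check added by this patch */
	if (p->disable_depth > 0) {
		if (p->runtime_status == MODEL_ACTIVE &&
		    p->last_status == MODEL_ACTIVE)
			return 1;
		return -EACCES;
	}
	return 0;
}

/* Usage: the state between force suspend and force resume. */
static void model_demo(void)
{
	struct model_power p = {
		.needs_force_resume = true,
		.disable_depth = 1,
		.runtime_status = MODEL_SUSPENDED,
		.last_status = MODEL_SUSPENDED, /* arbitrary for this case */
	};

	assert(model_rpm_resume_checks(&p, false) == -EACCES);
	assert(model_rpm_resume_checks(&p, true) == 1);
}
```

With the check in place the driver no longer sees an error in step 3),
so it does not take the error path that drops the usage count.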

Also, update the documentation to change the description of
pm_runtime_resume() to reflect the new behavior of rpm_resume().

Cc: Rafael J. Wysocki <rafael@xxxxxxxxxx>
Cc: Len Brown <len.brown@xxxxxxxxx>
Cc: Pavel Machek <pavel@xxxxxx>
Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
Cc: Ulf Hansson <ulf.hansson@xxxxxxxxxx>
Signed-off-by: Liu Ying <victor.liu@xxxxxxx>
---
v1->v2:
* Fix the commit message to explain why the issue happens: the device
  usage count zeroed by pm_runtime_put() at the dpm_complete stage
  eventually leaves the device in runtime PM suspend status.

Documentation/power/runtime_pm.rst | 14 +++++++-------
drivers/base/power/runtime.c | 2 ++
2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/Documentation/power/runtime_pm.rst b/Documentation/power/runtime_pm.rst
index 65b86e487afe..6266f0ac02a8 100644
--- a/Documentation/power/runtime_pm.rst
+++ b/Documentation/power/runtime_pm.rst
@@ -337,13 +337,13 @@ drivers/base/power/runtime.c and include/linux/pm_runtime.h:

`int pm_runtime_resume(struct device *dev);`
- execute the subsystem-level resume callback for the device; returns 0 on
- success, 1 if the device's runtime PM status is already 'active' (also if
- 'power.disable_depth' is nonzero, but the status was 'active' when it was
- changing from 0 to 1) or error code on failure, where -EAGAIN means it may
- be safe to attempt to resume the device again in future, but
- 'power.runtime_error' should be checked additionally, and -EACCES means
- that the callback could not be run, because 'power.disable_depth' was
- different from 0
+ success, 1 if the device's runtime PM status is assumed to be 'active'
+ with force resume or is already 'active' (also if 'power.disable_depth' is
+ nonzero, but the status was 'active' when it was changing from 0 to 1) or
+ error code on failure, where -EAGAIN means it may be safe to attempt to
+ resume the device again in future, but 'power.runtime_error' should be
+ checked additionally, and -EACCES means that the callback could not be
+ run, because 'power.disable_depth' was different from 0

`int pm_runtime_resume_and_get(struct device *dev);`
- run pm_runtime_resume(dev) and if successful, increment the device's
diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index 997be3ac20a7..0bce66ea0036 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -762,6 +762,8 @@ static int rpm_resume(struct device *dev, int rpmflags)
repeat:
if (dev->power.runtime_error) {
retval = -EINVAL;
+ } else if (dev->power.needs_force_resume) {
+ retval = 1;
} else if (dev->power.disable_depth > 0) {
if (dev->power.runtime_status == RPM_ACTIVE &&
dev->power.last_status == RPM_ACTIVE)
--
2.37.1