Re: [PATCH v4 1/6] PM / core: Add LEAVE_SUSPENDED driver flag

From: Ulf Hansson
Date: Mon Nov 20 2017 - 07:25:43 EST


On 18 November 2017 at 15:31, Rafael J. Wysocki <rjw@xxxxxxxxxxxxx> wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
>
> Define and document a new driver flag, DPM_FLAG_LEAVE_SUSPENDED, to
> instruct the PM core and middle-layer (bus type, PM domain, etc.)
> code that it is desirable to leave the device in runtime suspend
> after system-wide transitions to the working state (for example,
> the device may be slow to resume and it may be better to avoid
> resuming it right away).
>
> Generally, the middle-layer code involved in the handling of the
> device is expected to indicate to the PM core whether or not the
> device may be left in suspend with the help of the device's
> power.may_skip_resume status bit. That has to happen in the "noirq"
> phase of the preceding system suspend (or analogous) transition.
> The middle layer is then responsible for handling the device as
> appropriate in its "noirq" resume callback which is executed
> regardless of whether or not the device may be left suspended, but
> the other resume callbacks (except for ->complete) will be skipped
> automatically by the core if the device really can be left in
> suspend.
>
> The additional power.must_resume status bit introduced for the
> implementation of this mechanisn is used internally by the PM core
> to track the requirement to resume the device (which may depend on
> its children etc).
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> Acked-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>

Reviewed-by: Ulf Hansson <ulf.hansson@xxxxxxxxxx>

Kind regards
Uffe

> ---
>
> v3 -> v4: Fix the dev->power.usage_count check added in v3, clarify
> documentation, add comment to explain why the runtime PM status
> is changed in device_resume_noirq() and make the description of
> the NEVER_SKIP flag more precise.
>
> v2 -> v3: Take dev->power.usage_count when updating power.must_resume in
> __device_suspend_noirq().
>
> ---
> Documentation/driver-api/pm/devices.rst | 27 ++++++++++-
> drivers/base/power/main.c | 73 +++++++++++++++++++++++++++++---
> include/linux/pm.h | 16 +++++--
> 3 files changed, 105 insertions(+), 11 deletions(-)
>
> Index: linux-pm/include/linux/pm.h
> ===================================================================
> --- linux-pm.orig/include/linux/pm.h
> +++ linux-pm/include/linux/pm.h
> @@ -556,9 +556,10 @@ struct pm_subsys_data {
> * These flags can be set by device drivers at the probe time. They need not be
> * cleared by the drivers as the driver core will take care of that.
> *
> - * NEVER_SKIP: Do not skip system suspend/resume callbacks for the device.
> + * NEVER_SKIP: Do not skip all system suspend/resume callbacks for the device.
> * SMART_PREPARE: Check the return value of the driver's ->prepare callback.
> * SMART_SUSPEND: No need to resume the device from runtime suspend.
> + * LEAVE_SUSPENDED: Avoid resuming the device during system resume if possible.
> *
> * Setting SMART_PREPARE instructs bus types and PM domains which may want
> * system suspend/resume callbacks to be skipped for the device to return 0 from
> @@ -572,10 +573,14 @@ struct pm_subsys_data {
> * necessary from the driver's perspective. It also may cause them to skip
> * invocations of the ->suspend_late and ->suspend_noirq callbacks provided by
> * the driver if they decide to leave the device in runtime suspend.
> + *
> + * Setting LEAVE_SUSPENDED informs the PM core and middle-layer code that the
> + * driver prefers the device to be left in suspend after system resume.
> */
> -#define DPM_FLAG_NEVER_SKIP BIT(0)
> -#define DPM_FLAG_SMART_PREPARE BIT(1)
> -#define DPM_FLAG_SMART_SUSPEND BIT(2)
> +#define DPM_FLAG_NEVER_SKIP BIT(0)
> +#define DPM_FLAG_SMART_PREPARE BIT(1)
> +#define DPM_FLAG_SMART_SUSPEND BIT(2)
> +#define DPM_FLAG_LEAVE_SUSPENDED BIT(3)
>
> struct dev_pm_info {
> pm_message_t power_state;
> @@ -597,6 +602,8 @@ struct dev_pm_info {
> bool wakeup_path:1;
> bool syscore:1;
> bool no_pm_callbacks:1; /* Owned by the PM core */
> + unsigned int must_resume:1; /* Owned by the PM core */
> + unsigned int may_skip_resume:1; /* Set by subsystems */
> #else
> unsigned int should_wakeup:1;
> #endif
> @@ -765,6 +772,7 @@ extern int pm_generic_poweroff_late(stru
> extern int pm_generic_poweroff(struct device *dev);
> extern void pm_generic_complete(struct device *dev);
>
> +extern bool dev_pm_may_skip_resume(struct device *dev);
> extern bool dev_pm_smart_suspend_and_suspended(struct device *dev);
>
> #else /* !CONFIG_PM_SLEEP */
> Index: linux-pm/drivers/base/power/main.c
> ===================================================================
> --- linux-pm.orig/drivers/base/power/main.c
> +++ linux-pm/drivers/base/power/main.c
> @@ -528,6 +528,18 @@ static void dpm_watchdog_clear(struct dp
> /*------------------------- Resume routines -------------------------*/
>
> /**
> + * dev_pm_may_skip_resume - System-wide device resume optimization check.
> + * @dev: Target device.
> + *
> + * Checks whether or not the device may be left in suspend after a system-wide
> + * transition to the working state.
> + */
> +bool dev_pm_may_skip_resume(struct device *dev)
> +{
> + return !dev->power.must_resume && pm_transition.event != PM_EVENT_RESTORE;
> +}
> +
> +/**
> * device_resume_noirq - Execute a "noirq resume" callback for given device.
> * @dev: Device to handle.
> * @state: PM transition of the system being carried out.
> @@ -575,6 +587,19 @@ static int device_resume_noirq(struct de
> error = dpm_run_callback(callback, dev, state, info);
> dev->power.is_noirq_suspended = false;
>
> + if (dev_pm_may_skip_resume(dev)) {
> + /*
> + * The device is going to be left in suspend, but it might not
> + * have been in runtime suspend before the system suspended, so
> + * its runtime PM status needs to be updated to avoid confusing
> + * the runtime PM framework when runtime PM is enabled for the
> + * device again.
> + */
> + pm_runtime_set_suspended(dev);
> + dev->power.is_late_suspended = false;
> + dev->power.is_suspended = false;
> + }
> +
> Out:
> complete_all(&dev->power.completion);
> TRACE_RESUME(error);
> @@ -1076,6 +1101,22 @@ static pm_message_t resume_event(pm_mess
> return PMSG_ON;
> }
>
> +static void dpm_superior_set_must_resume(struct device *dev)
> +{
> + struct device_link *link;
> + int idx;
> +
> + if (dev->parent)
> + dev->parent->power.must_resume = true;
> +
> + idx = device_links_read_lock();
> +
> + list_for_each_entry_rcu(link, &dev->links.suppliers, c_node)
> + link->supplier->power.must_resume = true;
> +
> + device_links_read_unlock(idx);
> +}
> +
> /**
> * __device_suspend_noirq - Execute a "noirq suspend" callback for given device.
> * @dev: Device to handle.
> @@ -1127,10 +1168,28 @@ static int __device_suspend_noirq(struct
> }
>
> error = dpm_run_callback(callback, dev, state, info);
> - if (!error)
> - dev->power.is_noirq_suspended = true;
> - else
> + if (error) {
> async_error = error;
> + goto Complete;
> + }
> +
> + dev->power.is_noirq_suspended = true;
> +
> + if (dev_pm_test_driver_flags(dev, DPM_FLAG_LEAVE_SUSPENDED)) {
> + /*
> + * The only safe strategy here is to require that if the device
> + * may not be left in suspend, resume callbacks must be invoked
> + * for it.
> + */
> + dev->power.must_resume = dev->power.must_resume ||
> + !dev->power.may_skip_resume ||
> + atomic_read(&dev->power.usage_count) > 1;
> + } else {
> + dev->power.must_resume = true;
> + }
> +
> + if (dev->power.must_resume)
> + dpm_superior_set_must_resume(dev);
>
> Complete:
> complete_all(&dev->power.completion);
> @@ -1487,6 +1546,9 @@ static int __device_suspend(struct devic
> dev->power.direct_complete = false;
> }
>
> + dev->power.may_skip_resume = false;
> + dev->power.must_resume = false;
> +
> dpm_watchdog_set(&wd, dev);
> device_lock(dev);
>
> @@ -1652,8 +1714,9 @@ static int device_prepare(struct device
> if (dev->power.syscore)
> return 0;
>
> - WARN_ON(dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND) &&
> - !pm_runtime_enabled(dev));
> + WARN_ON(!pm_runtime_enabled(dev) &&
> + dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND |
> + DPM_FLAG_LEAVE_SUSPENDED));
>
> /*
> * If a device's parent goes into runtime suspend at the wrong time,
> Index: linux-pm/Documentation/driver-api/pm/devices.rst
> ===================================================================
> --- linux-pm.orig/Documentation/driver-api/pm/devices.rst
> +++ linux-pm/Documentation/driver-api/pm/devices.rst
> @@ -788,6 +788,29 @@ must reflect the "active" status for run
>
> During system-wide resume from a sleep state it's easiest to put devices into
> the full-power state, as explained in :file:`Documentation/power/runtime_pm.txt`.
> -Refer to that document for more information regarding this particular issue as
> +[Refer to that document for more information regarding this particular issue as
> well as for information on the device runtime power management framework in
> -general.
> +general.]
> +
> +However, it often is desirable to leave devices in suspend after system
> +transitions to the working state, especially if those devices had been in
> +runtime suspend before the preceding system-wide suspend (or analogous)
> +transition. Device drivers can use the ``DPM_FLAG_LEAVE_SUSPENDED`` flag to
> +indicate to the PM core (and middle-layer code) that they prefer the specific
> +devices handled by them to be left suspended and they have no problems with
> +skipping their system-wide resume callbacks for this reason. Whether or not the
> +devices will actually be left in suspend may depend on their state before the
> +given system suspend-resume cycle and on the type of the system transition under
> +way. In particular, devices are not left suspended if that transition is a
> +restore from hibernation, as device states are not guaranteed to be reflected
> +by the information stored in the hibernation image in that case.
> +
> +The middle-layer code involved in the handling of the device is expected to
> +indicate to the PM core if the device may be left in suspend by setting its
> +:c:member:`power.may_skip_resume` status bit which is checked by the PM core
> +during the "noirq" phase of the preceding system-wide suspend (or analogous)
> +transition. The middle layer is then responsible for handling the device as
> +appropriate in its "noirq" resume callback, which is executed regardless of
> +whether or not the device is left suspended, but the other resume callbacks
> +(except for ``->complete``) will be skipped automatically by the PM core if the
> +device really can be left in suspend.
>