Re: [Update][PATCH v2 01/12] PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags
From: Ulf Hansson
Date: Mon Oct 23 2017 - 12:37:50 EST
On 19 October 2017 at 01:17, Rafael J. Wysocki <rjw@xxxxxxxxxxxxx> wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
>
> The motivation for this change is to provide a way to work around
> a problem with the direct-complete mechanism used for avoiding
> system suspend/resume handling for devices in runtime suspend.
>
> The problem is that some middle layer code (the PCI bus type and
> the ACPI PM domain in particular) returns positive values from its
> system suspend ->prepare callbacks regardless of whether the driver's
> ->prepare returns a positive value or 0, which effectively prevents
> drivers from being able to control the direct-complete feature.
> Some drivers need that control, however, and the PCI bus type has
> grown its own flag to deal with this issue, but since it is not
> limited to PCI, it is better to address it by adding driver flags at
> the core level.
>
> To that end, add a driver_flags field to struct dev_pm_info for flags
> that can be set by device drivers at the probe time to inform the PM
> core and/or bus types, PM domains and so on on the capabilities and/or
> preferences of device drivers. Also add two static inline helpers
> for setting that field and testing it against a given set of flags
> and make the driver core clear it automatically on driver remove
> and probe failures.
>
> Define and document two PM driver flags related to the direct-
> complete feature: NEVER_SKIP and SMART_PREPARE that can be used,
> respectively, to indicate to the PM core that the direct-complete
> mechanism should never be used for the device and to inform the
> middle layer code (bus types, PM domains etc) that it can only
> request the PM core to use the direct-complete mechanism for
> the device (by returning a positive value from its ->prepare
> callback) if it also has been requested by the driver.
>
> While at it, make the core check pm_runtime_suspended() when
> setting power.direct_complete so that it doesn't need to be
> checked by ->prepare callbacks.
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> ---
>
> -> v2: Change the data type for driver_flags to u32 as suggested by Greg
> and fix a couple of documentation typos pointed out by Lukas.
>
> ---
> Documentation/driver-api/pm/devices.rst | 14 ++++++++++++++
> Documentation/power/pci.txt | 19 +++++++++++++++++++
> drivers/acpi/device_pm.c | 3 +++
> drivers/base/dd.c | 2 ++
> drivers/base/power/main.c | 4 +++-
> drivers/pci/pci-driver.c | 5 ++++-
> include/linux/device.h | 10 ++++++++++
> include/linux/pm.h | 20 ++++++++++++++++++++
> 8 files changed, 75 insertions(+), 2 deletions(-)
>
> Index: linux-pm/include/linux/device.h
> ===================================================================
> --- linux-pm.orig/include/linux/device.h
> +++ linux-pm/include/linux/device.h
> @@ -1070,6 +1070,16 @@ static inline void dev_pm_syscore_device
> #endif
> }
>
> +static inline void dev_pm_set_driver_flags(struct device *dev, u32 flags)
> +{
> + dev->power.driver_flags = flags;
> +}
> +
> +static inline bool dev_pm_test_driver_flags(struct device *dev, u32 flags)
> +{
> + return !!(dev->power.driver_flags & flags);
> +}
> +
> static inline void device_lock(struct device *dev)
> {
> mutex_lock(&dev->mutex);
> Index: linux-pm/include/linux/pm.h
> ===================================================================
> --- linux-pm.orig/include/linux/pm.h
> +++ linux-pm/include/linux/pm.h
> @@ -550,6 +550,25 @@ struct pm_subsys_data {
> #endif
> };
>
> +/*
> + * Driver flags to control system suspend/resume behavior.
> + *
> + * These flags can be set by device drivers at the probe time. They need not be
> + * cleared by the drivers as the driver core will take care of that.
> + *
> + * NEVER_SKIP: Do not skip system suspend/resume callbacks for the device.
> + * SMART_PREPARE: Check the return value of the driver's ->prepare callback.
> + *
> + * Setting SMART_PREPARE instructs bus types and PM domains which may want
> + * system suspend/resume callbacks to be skipped for the device to return 0 from
> + * their ->prepare callbacks if the driver's ->prepare callback returns 0 (in
> + * other words, the system suspend/resume callbacks can only be skipped for the
> + * device if its driver doesn't object against that). This flag has no effect
> + * if NEVER_SKIP is set.
In principle ACPI/PCI middle-layer/PM domain could have started out by
respecting the return values from driver's ->prepare() callbacks in
case those existed, but they didn't, and that is the reason to why the
SMART_PREPARE is needed. Right?
My point is, I don't think we should encourage other middle-layer to
support the SMART_PREPARE flag, simply because they should be able to
cope without it. To make this more obvious we could try to find a
different name of the flag indicating that, or at least make it clear
that we don't want it to be used by others than ACPI/PCI via
documenting that.
> + */
> +#define DPM_FLAG_NEVER_SKIP BIT(0)
> +#define DPM_FLAG_SMART_PREPARE BIT(1)
> +
> struct dev_pm_info {
> pm_message_t power_state;
> unsigned int can_wakeup:1;
> @@ -561,6 +580,7 @@ struct dev_pm_info {
> bool is_late_suspended:1;
> bool early_init:1; /* Owned by the PM core */
> bool direct_complete:1; /* Owned by the PM core */
> + u32 driver_flags;
> spinlock_t lock;
> #ifdef CONFIG_PM_SLEEP
> struct list_head entry;
> Index: linux-pm/drivers/base/dd.c
> ===================================================================
> --- linux-pm.orig/drivers/base/dd.c
> +++ linux-pm/drivers/base/dd.c
> @@ -464,6 +464,7 @@ pinctrl_bind_failed:
> if (dev->pm_domain && dev->pm_domain->dismiss)
> dev->pm_domain->dismiss(dev);
> pm_runtime_reinit(dev);
> + dev_pm_set_driver_flags(dev, 0);
>
> switch (ret) {
> case -EPROBE_DEFER:
> @@ -869,6 +870,7 @@ static void __device_release_driver(stru
> if (dev->pm_domain && dev->pm_domain->dismiss)
> dev->pm_domain->dismiss(dev);
> pm_runtime_reinit(dev);
> + dev_pm_set_driver_flags(dev, 0);
>
> klist_remove(&dev->p->knode_driver);
> device_pm_check_callbacks(dev);
> Index: linux-pm/drivers/base/power/main.c
> ===================================================================
> --- linux-pm.orig/drivers/base/power/main.c
> +++ linux-pm/drivers/base/power/main.c
> @@ -1700,7 +1700,9 @@ unlock:
> * applies to suspend transitions, however.
> */
> spin_lock_irq(&dev->power.lock);
> - dev->power.direct_complete = ret > 0 && state.event == PM_EVENT_SUSPEND;
> + dev->power.direct_complete = state.event == PM_EVENT_SUSPEND &&
> + pm_runtime_suspended(dev) && ret > 0 &&
> + !dev_pm_test_driver_flags(dev, DPM_FLAG_NEVER_SKIP);
> spin_unlock_irq(&dev->power.lock);
> return 0;
> }
> Index: linux-pm/drivers/pci/pci-driver.c
> ===================================================================
> --- linux-pm.orig/drivers/pci/pci-driver.c
> +++ linux-pm/drivers/pci/pci-driver.c
> @@ -682,8 +682,11 @@ static int pci_pm_prepare(struct device
>
> if (drv && drv->pm && drv->pm->prepare) {
> int error = drv->pm->prepare(dev);
> - if (error)
> + if (error < 0)
> return error;
> +
> + if (!error && dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_PREPARE))
> + return 0;
> }
> return pci_dev_keep_suspended(to_pci_dev(dev));
> }
> Index: linux-pm/drivers/acpi/device_pm.c
> ===================================================================
> --- linux-pm.orig/drivers/acpi/device_pm.c
> +++ linux-pm/drivers/acpi/device_pm.c
> @@ -965,6 +965,9 @@ int acpi_subsys_prepare(struct device *d
> if (ret < 0)
> return ret;
>
> + if (!ret && dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_PREPARE))
> + return 0;
So if the driver don't implement the ->prepare() callback, you still
want to treat this flag as it has one assigned and that it returns 0?
It seems not entirely according to what you have documented about the flag.
> +
> if (!adev || !pm_runtime_suspended(dev))
> return 0;
>
> Index: linux-pm/Documentation/driver-api/pm/devices.rst
> ===================================================================
> --- linux-pm.orig/Documentation/driver-api/pm/devices.rst
> +++ linux-pm/Documentation/driver-api/pm/devices.rst
> @@ -354,6 +354,20 @@ the phases are: ``prepare``, ``suspend``
> is because all such devices are initially set to runtime-suspended with
> runtime PM disabled.
>
> + This feature also can be controlled by device drivers by using the
> + ``DPM_FLAG_NEVER_SKIP`` and ``DPM_FLAG_SMART_PREPARE`` driver power
> + management flags. [Typically, they are set at the time the driver is
> + probed against the device in question by passing them to the
> + :c:func:`dev_pm_set_driver_flags` helper function.] If the first of
> + these flags is set, the PM core will not apply the direct-complete
> + procedure described above to the given device and, consequenty, to any
> + of its ancestors. The second flag, when set, informs the middle layer
> + code (bus types, device types, PM domains, classes) that it should take
> + the return value of the ``->prepare`` callback provided by the driver
> + into account and it may only return a positive value from its own
> + ``->prepare`` callback if the driver's one also has returned a positive
> + value.
> +
> 2. The ``->suspend`` methods should quiesce the device to stop it from
> performing I/O. They also may save the device registers and put it into
> the appropriate low-power state, depending on the bus type the device is
> Index: linux-pm/Documentation/power/pci.txt
> ===================================================================
> --- linux-pm.orig/Documentation/power/pci.txt
> +++ linux-pm/Documentation/power/pci.txt
> @@ -961,6 +961,25 @@ dev_pm_ops to indicate that one suspend
> .suspend(), .freeze(), and .poweroff() members and one resume routine is to
> be pointed to by the .resume(), .thaw(), and .restore() members.
>
> +3.1.19. Driver Flags for Power Management
> +
> +The PM core allows device drivers to set flags that influence the handling of
> +power management for the devices by the core itself and by middle layer code
> +including the PCI bus type. The flags should be set once at the driver probe
> +time with the help of the dev_pm_set_driver_flags() function and they should not
> +be updated directly afterwards.
I am wondering if we really need to make a statement generic to all
"driver PM flags" that these flags must be set at ->probe(). Maybe
that is better documented per flag, rather than for all. The reason
why I bring it up, is that I would not be surprised if a new flag
comes a long and which may be used a bit differently, not requiring
that.
Of course we can also update that later on, if needed.
> +
> +The DPM_FLAG_NEVER_SKIP flag prevents the PM core from using the direct-complete
> +mechanism allowing device suspend/resume callbacks to be skipped if the device
> +is in runtime suspend when the system suspend starts. That also affects all of
> +the ancestors of the device, so this flag should only be used if absolutely
> +necessary.
> +
> +The DPM_FLAG_SMART_PREPARE flag instructs the PCI bus type to only return a
> +positive value from pci_pm_prepare() if the ->prepare callback provided by the
> +driver of the device returns a positive value. That allows the driver to opt
> +out from using the direct-complete mechanism dynamically.
> +
> 3.2. Device Runtime Power Management
> ------------------------------------
> In addition to providing device power management callbacks PCI device drivers
>
Kind regards
Uffe