Re: [PATCH] ARM: OMAP2+: omap_device: maintain sane runtime pm status around suspend/resume
From: Kevin Hilman
Date: Thu Nov 07 2013 - 18:44:58 EST
Nishanth Menon <nm@xxxxxx> writes:
> OMAP device hooks around suspend|resume_noirq ensures that hwmod
> devices are forced to idle using omap_device_idle/enable as part of
> the last stage of suspend activity.
>
> For a device such as i2c who uses autosuspend, it is possible to enter
> the suspend path with dev->power.runtime_status = RPM_ACTIVE.
>
> As part of the suspend flow, the generic runtime logic would increment
> it's dev->power.disable_depth to 1. This should prevent further
> pm_runtime_get_sync from succeeding once the runtime_status has been
> set to RPM_SUSPENDED.
>
> Now, as part of the suspend_noirq handler in omap_device, we force the
> following: if the device status is !suspended, we force the device
> to idle using omap_device_idle (clocks are cut etc..). This ensures
> that from a hardware perspective, the device is "suspended". However,
> runtime_status is left to be active.
>
> *if* an operation is attempted after this point to
> pm_runtime_get_sync, runtime framework depends on runtime_status to
> indicate accurately the device status, and since it sees it to be
> ACTIVE, it assumes the module is functional and returns a non-error
> value. As a result the user will see pm_runtime_get succeed, however a
> register access will crash due to the lack of clocks.
Ouch.
Dumb Q: who is requesting an i2c transaction after ->suspend_noirq().
The i2c driver itself should be able to detect that it's being accessed
after this point and return an error.
That being said, I agree that omap_device should still be catching this
case in order to find/fix driver races like this.
> To prevent this from happening, we should ensure that runtime_status
> exactly indicates the device status. As a result of this change
> any further calls to pm_runtime_get* would return -EACCES (since
> disable_depth is 1). On resume, we restore the clocks and runtime
> status exactly as we suspended with.
>
> Reported-by: J Keerthy <j-keerthy@xxxxxx>
> Signed-off-by: Nishanth Menon <nm@xxxxxx>
> Acked-by: Rajendra Nayak <rnayak@xxxxxx>
> ---
> patch baseline: V3.12 tag (also applies on linux-next next-20131107 tag)
>
> Logs from 3.12 based vendor kernel:
> Before: http://pastebin.com/m5KxnB7a
> After: http://pastebin.com/8AfX4e7r
>
> The discussion on cpufreq side of the story which triggered this scenario:
> http://marc.info/?l=linux-pm&m=138263811321921&w=2
>
> Tested on TI vendor kernel (with dt boot):
> AM335x: evm, BBB, sk, BBW
> OMAP5uEVM, DRA7-evm
>
> arch/arm/mach-omap2/omap_device.c | 16 ++++++++++++++--
> arch/arm/mach-omap2/omap_device.h | 1 +
> 2 files changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm/mach-omap2/omap_device.c b/arch/arm/mach-omap2/omap_device.c
> index b69dd9a..87ecbb0 100644
> --- a/arch/arm/mach-omap2/omap_device.c
> +++ b/arch/arm/mach-omap2/omap_device.c
> @@ -621,6 +621,13 @@ static int _od_suspend_noirq(struct device *dev)
>
> if (!ret && !pm_runtime_status_suspended(dev)) {
> if (pm_generic_runtime_suspend(dev) == 0) {
> + if (!pm_runtime_suspended(dev)) {
Why the addition check for supended here?
This version (as opposed to the _status_suspended() version above will
fail if runtime PM has been disabled from userspace (via
/sys/devices/.../power/control), and will thus prevent low power states
from being hit in suspend if runtime suspend has been disabled from
userspace. That's a bug.
> + /* NOTE: *might* indicate driver race */
Yes, a driver race which should then be fixed in the driver.
> + dev_dbg(dev, "%s: Force suspending\n",
> + __func__);
> + pm_runtime_set_suspended(dev);
> + od->flags |= OMAP_DEVICE_SUSPEND_FORCED;
Not sure why you need an additonal flag. Why not just always do this
and use the existin OMAP_DEVICE_SUSPENDED flag.
Kevin
> + }
> omap_device_idle(pdev);
> od->flags |= OMAP_DEVICE_SUSPENDED;
> }
> @@ -634,10 +641,15 @@ static int _od_resume_noirq(struct device *dev)
> struct platform_device *pdev = to_platform_device(dev);
> struct omap_device *od = to_omap_device(pdev);
>
> - if ((od->flags & OMAP_DEVICE_SUSPENDED) &&
> - !pm_runtime_status_suspended(dev)) {
> + if (od->flags & OMAP_DEVICE_SUSPENDED) {
> od->flags &= ~OMAP_DEVICE_SUSPENDED;
> omap_device_enable(pdev);
> +
> + if (od->flags & OMAP_DEVICE_SUSPEND_FORCED) {
> + pm_runtime_set_active(dev);
> + od->flags &= ~OMAP_DEVICE_SUSPEND_FORCED;
> + }
> +
> pm_generic_runtime_resume(dev);
> }
>
> diff --git a/arch/arm/mach-omap2/omap_device.h b/arch/arm/mach-omap2/omap_device.h
> index 17ca1ae..45885b0 100644
> --- a/arch/arm/mach-omap2/omap_device.h
> +++ b/arch/arm/mach-omap2/omap_device.h
> @@ -38,6 +38,7 @@ extern struct dev_pm_domain omap_device_pm_domain;
>
> /* omap_device.flags values */
> #define OMAP_DEVICE_SUSPENDED BIT(0)
> +#define OMAP_DEVICE_SUSPEND_FORCED BIT(1)
>
> /**
> * struct omap_device - omap_device wrapper for platform_devices
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/