Re: [PATCH 3/4] PM / core: Direct DPM_FLAG_SMART_SUSPEND optimization

From: Ulf Hansson
Date: Tue Dec 19 2017 - 08:10:58 EST


On 19 December 2017 at 12:13, Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
> On Tue, Dec 19, 2017 at 8:38 AM, Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote:
>> On 10 December 2017 at 01:00, Rafael J. Wysocki <rjw@xxxxxxxxxxxxx> wrote:
>>> From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
>>>
>>> Make the PM core avoid invoking the "late" and "noirq" system-wide
>>> suspend (or analogous) callbacks provided by device drivers directly
>>> for devices with DPM_FLAG_SMART_SUSPEND set that are in runtime
>>> suspend during the "late" and "noirq" phases of system-wide suspend
>>> (or analogous) transitions. That is only done for devices without
>>> any middle-layer "late" and "noirq" suspend callbacks (to avoid
>>> confusing the middle layer if there is one).
>>>
>>> The underlying observation is that runtime PM is disabled for devices
>>> during the "late" and "noirq" system-wide suspend phases, so if they
>>> remain in runtime suspend from the "late" phase forward, it doesn't
>>> make sense to invoke the "late" and "noirq" callbacks provided by
>>> the drivers for them (arguably, the device is already suspended and
>>> in the right state). Thus, if the remaining driver suspend callbacks
>>> are to be invoked directly by the core, they can be skipped.
>>>
>
> It looks like I'm consistently failing to explain my point clearly enough. :-)
>
>> As I have stated earlier, this isn't going to solve the general case,
>> as the above change log seems to state.
>
> Well, it doesn't say that or anything similar, at least to my eyes.
>
> The observation is that if you have set DPM_FLAG_SMART_SUSPEND, then
> you need to be prepared for your ->suspend_late and ->suspend_noirq to
> be skipped (because the ACPI PM domain does that and you may happen to
> work with it, for example) if the device is already suspended at the
> beginning of the "late suspend" phase. That's already documented.
>
> Given the above, and the fact that there is not much to be done for a
> suspended device in ->suspend_late and ->suspend_noirq, why can't the
> core skip these callbacks too if there's no middle layer?
>
> But the reason why I really need this is because
> i2c-designware-platdrv can work both with the ACPI PM domain and
> standalone and I need it to be handled consistently in both cases.

Yeah, I understand that.

>
>> I think we really need to do that, before adding yet another system suspend/resume optimization
>> path in the PM core.
>
> So what exactly is the technical argument in the above?

As stated, when the driver needs additional operations to be done, it
all falls a part.

>
>> The main reason is that lots of drivers have additional operations to
>> perform, besides making sure that its device is put into a "runtime
>> suspended state" during system suspend. In other words, skipping
>> system suspend callbacks (and likewise for system resume) is to me the
>> wrong solution to mitigate these problems.
>>
>>> This change really makes it possible for, say, platform device
>>> drivers to re-use runtime PM suspend and resume callbacks by
>>> pointing ->suspend_late and ->resume_early, respectively (and
>>> possibly the analogous hibernation-related callback pointers too),
>>> to them without adding any extra "is the device already suspended?"
>>> type of checks to the callback routines, as long as they will be
>>> invoked directly by the core.
>>
>> Certainly there are drivers that could start to deploying this
>> solution, because at the moment they don't have any additional
>> operations to perform at system suspend. But what about the rest,
>> don't we care about them?
>
> We do, somewhat. :-)
>
>> Moreover, what happens when/if a driver that has deployed this
>> solution, starts being used on a new SoC and then additional
>> operations during system suspend becomes required (for example
>> pinctrls that needs to be put in a system sleep state)? Then
>> everything falls apart, doesn't it?
>
> Then you runtime-resume the device in ->suspend() and the remaining
> callbacks will run for you just fine.

Well, this proves my concern.

Even if the driver has additional operations to perform, why should it
have to runtime resume its device to have the callbacks to be invoked?
That may be a waste in both time and power.

No, this isn't good enough, sorry.

>
> And IMO running "late" and "noirq" system suspend callbacks on a
> suspended device is super-fragile anyway as they generally need to
> distinguish between the "suspended" and "not suspended" cases
> consistently and what they do may affect the children or the parent of
> the device in ways that are difficult to predict in general. So, I'd
> rather not do that in any case.

That may be the case for the ACPI PM domain, but I don't see an issue
when devices are being attached to a more trivial middle layer/PM
domain.

Kind regards
Uffe