Re: [PATCH v2] PM: Move disabling/enabling runtime PM to suspend/resume noirq

From: Rafael J. Wysocki
Date: Wed Jul 03 2019 - 05:08:24 EST

On Wed, Jul 3, 2019 at 2:56 AM Muchun Song <smuchun@xxxxxxxxx> wrote:
> Rafael J. Wysocki <rafael@xxxxxxxxxx> ä2019å7æ3æåä äå1:54åéï
> >
> > On Tue, Jul 2, 2019 at 6:37 PM Muchun Song <smuchun@xxxxxxxxx> wrote:
> > >
> > > Currently, the PM core disables runtime PM for all devices right after
> > > executing subsystem/driver .suspend_late() callbacks for them and
> > > re-enables it right before executing subsystem/driver .resume_early()
> > > callbacks for them. This may lead to problems when there are two devices
> > > such that the irq handler thread function executed for one of them
> > > depends on runtime PM working for the other. E.g. There are two devices,
> > > one is i2c slave device depends on another device which can be the i2c
> > > adapter device. The slave device can generate system wakeup signals and
> > > is enabled to wake up the system(via call enable_irq_wake()). So, the irq
> > > of slave device is enabled. If a wakeup signal generate after executing
> > > subsystem/driver .suspend_late() callbacks. Then, the irq handler thread
> > > function will be called(The irq is requested via request_threaded_irq())
> > > and the slave device reads data via i2c adapter device(via i2c_transfer()).
> > > In that case, it may be failed to read data because of the runtime PM
> > > disabled.
> > >
> > > It is also analogously for resume. If a wakeup signal generate when the
> > > system is in the sleep state. The irq handler thread function may be
> > > called before executing subsystem/driver .resume_early(). In that case,
> > > it also may be failed to read data because of the runtime PM disabled.
> > >
> >
> > This has been discussed for a number of times, documented and no, I'm
> > not going to apply this patch.
> Thanks for your reply. I want to know why we can't do that, so where
> can I find the discussion?

The discussions rather.

You need to search email archives, especially for the reason why the
"late" and "early" callbacks have been added. There was a conference
presentation about that even.

The bottom line is that the change you are proposing cannot be made.

> > PM-runtime cannot be relied on during the "noirq" stages of suspend
> > and resume, which is why it is disabled by the core in the "late" and
> > "early" stages, respectively.
> >
> What better solution do we have for the example I am talking about
> which is described in the commit message?

It's not "what better solution is there", because what you are
proposing would simply not work.

Basically, if there are devices that may need the given one in the
"noirq" resume phase, this device needs to be resumed upfront. It may
be runtime-suspended again in the "early" resume phase.