Re: [PATCH 2/2] bus: mhi: host: pci_generic: Recover the device synchronously from mhi_pci_runtime_resume()

From: Manivannan Sadhasivam
Date: Wed Jan 08 2025 - 11:02:29 EST


On Wed, Jan 08, 2025 at 04:19:06PM +0100, Loic Poulain wrote:
> On Wed, 8 Jan 2025 at 14:39, Manivannan Sadhasivam via B4 Relay
> <devnull+manivannan.sadhasivam.linaro.org@xxxxxxxxxx> wrote:
> >
> > From: Manivannan Sadhasivam <manivannan.sadhasivam@xxxxxxxxxx>
> >
> > Currently, in mhi_pci_runtime_resume(), if the resume fails, recovery_work
> > is started asynchronously and success is returned. But this doesn't align
> > with what PM core expects as documented in
> > Documentation/power/runtime_pm.rst:
> >
> > "Once the subsystem-level resume callback (or the driver resume callback,
> > if invoked directly) has completed successfully, the PM core regards the
> > device as fully operational, which means that the device _must_ be able to
> > complete I/O operations as needed. The runtime PM status of the device is
> > then 'active'."
> >
> > So the PM core ends up marking the runtime PM status of the device as
> > 'active', even though the device is not able to handle the I/O operations.
> > This same condition more or less applies to system resume as well.
> >
> > So to avoid this ambiguity, try to recover the device synchronously from
> > mhi_pci_runtime_resume() and return the actual error code in the case of
> > recovery failure.
> >
> > For doing so, move the recovery code to __mhi_pci_recovery_work() helper
> > and call that from both mhi_pci_recovery_work() and
> > mhi_pci_runtime_resume(). The former still ignores the return value, while
> > the latter passes it to the PM core.
> >
> > Cc: stable@xxxxxxxxxxxxxxx # 5.13
> > Reported-by: Johan Hovold <johan@xxxxxxxxxx>
> > Closes: https://lore.kernel.org/mhi/Z2PbEPYpqFfrLSJi@xxxxxxxxxxxxxxxxxxxx
> > Fixes: d3800c1dce24 ("bus: mhi: pci_generic: Add support for runtime PM")
> > Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@xxxxxxxxxx>
>
> Note that it will noticeably impact the user experience on system-wide
> resume (mhi_pci_resume), because MHI devices usually take a while (a
> few seconds) to cold boot and reach a ready state (or time out in the
> worst case). So we may have people complaining about delayed resume
> regression on their laptop even if they are not using the MHI
> device/modem function. Are we ok with that?
>

Are you saying that the modem enters D3cold on every system suspend? I think
you are referring to x86 host machines here.

If that is the case, we should not be using the mhi_pci_runtime_*() callbacks
in mhi_pci_suspend() and mhi_pci_resume(). Rather, the MHI stack should be
powered down during suspend and powered up during resume.
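
To make the two paths concrete, here is a rough userspace model, not driver
code. Every model_* name is hypothetical and does not exist in the MHI stack
or the PM core. It only illustrates the control flow: a shared recovery helper
whose result is dropped by the worker but returned by runtime resume, and
system suspend/resume doing a plain power down/up instead of calling the
runtime PM callbacks.

```c
/* Userspace model only: none of these names are real MHI or PM core APIs. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct model_dev {
	bool powered;	/* MHI stack is powered up */
	bool link_ok;	/* device responds after leaving low power */
	bool boot_ok;	/* cold boot would reach the ready state */
};

/* Stand-in for a full power up; on real hardware this may take seconds. */
static int model_power_up(struct model_dev *dev)
{
	assert(dev != NULL);

	if (!dev->boot_ok)
		return -ETIMEDOUT;

	dev->powered = true;
	dev->link_ok = true;
	return 0;
}

static void model_power_down(struct model_dev *dev)
{
	assert(dev != NULL);
	dev->powered = false;
}

/* Shared helper: power cycle the device and report the outcome. */
static int model_recover(struct model_dev *dev)
{
	model_power_down(dev);
	return model_power_up(dev);
}

/* Asynchronous worker path: there is no caller to report to. */
static void model_recovery_work(struct model_dev *dev)
{
	(void)model_recover(dev);
}

/* Runtime resume: recover synchronously and hand the error to the caller. */
static int model_runtime_resume(struct model_dev *dev)
{
	if (dev->powered && dev->link_ok)
		return 0;

	return model_recover(dev);
}

/* System sleep: plain power down/up, no runtime PM callbacks involved. */
static int model_system_suspend(struct model_dev *dev)
{
	model_power_down(dev);
	return 0;
}

static int model_system_resume(struct model_dev *dev)
{
	return model_power_up(dev);
}
```

In this model a failed boot makes model_runtime_resume() return -ETIMEDOUT
instead of 0, which is the behaviour the PM core documentation asks for. The
cost Loic points out is visible too: model_system_resume() blocks until the
power up finishes.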

- Mani

--
மணிவண்ணன் சதாசிவம்