Re: [PATCH v2 0/4] PCI: Introduce pci_suspend_retains_context() API

From: Hans Zhang

Date: Fri May 22 2026 - 12:25:15 EST


Hi Mani,

We discussed a related patch earlier, in case you remember it. I'm not sure whether this series can solve my problem, which is described in the earlier thread quoted below:

https://lore.kernel.org/linux-pci/z4bq25pr35cklwoodz34pnfaopfrtbjwhc6gvbhbsvnwblhxia@frmtb3t3m4nk/

"""
> Hans: Before I added the printk for debugging, it hung here.
>
>
> I added the log output after debugging printk.
>
> Sky1 SOC Root Port driver's suspend function: sky1_pcie_suspend_noirq
> Our hardware is in STR (suspend-to-RAM), so the controller and PHY lose
> power.
>
> So in sky1_pcie_suspend_noirq, the AXI and APB clocks, etc. of the PCIe
> controller are turned off. In sky1_pcie_resume_noirq, the PCIe
> controller and PHY are reinitialized. If suspend did not disable the AXI
> and APB clocks and resume enabled them again, the reference counts that
> the kernel API keeps for those clocks would accumulate on every
> suspend/resume cycle.
>

So this is the actual issue (controller losing power during system suspend) and
everything else (ASPM, MSIX write) is a side effect of it.

Yes, this issue is common across several vendors and we need to come up with
a generic solution instead of hacking up the client drivers. I'm planning to
work on it in the coming days. Will keep you in the loop.

- Mani
"""


Best regards,
Hans

On 5/19/26 16:11, Manivannan Sadhasivam via B4 Relay wrote:
Hi all,

This series introduces a new PCI API, pci_suspend_retains_context(), that
lets client drivers know whether they can expect context retention across
suspend/resume, and uses it in the NVMe PCI host driver.

This new API is targeted to abstract the PCI power management details away from
the client drivers. This is needed because client drivers like NVMe make use of
APIs such as pm_suspend_via_firmware() and decide to keep the device in low
power mode if this API returns 'false'. But some platforms may have other
limitations like in the case of Qcom, where if the RC driver removes the PCIe RC
resource vote to allow the SoC to enter low power mode, it cannot reliably exit
the L1ss state when the endpoint asserts CLKREQ#. So in this case also, the
client drivers cannot keep the device in low power state during suspend and
expect context retention.
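
To make the decision described above concrete, here is a minimal userspace
sketch in C. It is only a model and not the code from this series: the
model_* names, the struct and the enum are all hypothetical, and the
assumption that the new API folds in both the pm_suspend_via_firmware()
result and the broken L1ss exit condition is a reading of the description
above, not a confirmed implementation detail.

```c
#include <stdbool.h>

/*
 * Hypothetical platform state for illustration only. These are not
 * kernel types; each field models one limitation named in the text.
 */
struct model_platform {
	/* Models pm_suspend_via_firmware() returning true. */
	bool suspend_via_firmware;
	/* Models an RC that cannot reliably exit L1ss after dropping its vote. */
	bool l1ss_exit_broken;
};

/*
 * Models the single question a client driver would ask:
 * "can I expect device context to survive system suspend?"
 * Assumption: any one limitation is enough to answer "no".
 */
static bool model_suspend_retains_context(const struct model_platform *p)
{
	if (p->suspend_via_firmware)
		return false;
	if (p->l1ss_exit_broken)
		return false;
	return true;
}

enum model_suspend_action {
	MODEL_KEEP_LOW_POWER,	/* leave the device in a low power state */
	MODEL_FULL_SHUTDOWN,	/* tear down and reinitialize on resume */
};

/* Models a client driver's suspend path choosing between the two. */
static enum model_suspend_action
model_client_suspend(const struct model_platform *p)
{
	return model_suspend_retains_context(p) ? MODEL_KEEP_LOW_POWER
						: MODEL_FULL_SHUTDOWN;
}
```

The point of the sketch is that the client driver makes one call and does
not need to know which platform limitation produced the answer, so a new
limitation would only need to be added in one place.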

And the list of such limitations may keep growing in the future. Without a
unified API, each client driver has to implement its own logic, which may cause
code duplication and may also lead to drivers missing some of the platform
limitations.

Once this series gets merged, we can extend this API usage to other client
drivers as well.

Testing
=======

This series was tested on a Qualcomm Hamoa based Lenovo ThinkPad T14s laptop
with an NVMe drive.

Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@xxxxxxxxxxxxxxxx>
---
Changes in v2:
- Renamed the API to pci_suspend_retains_context()
- Reworded the commit messages to include L10_REFCLK_ON + T_COMMONMODE as the
L1ss exit latency
- Rebased on top of v7.1-rc1

---
Manivannan Sadhasivam (4):
      PCI: Introduce an API to check if RC/platform can retain device context during suspend
      PCI: Indicate context lost if L1ss exit is broken during resume from system suspend
      PCI: qcom: Indicate broken L1ss exit during resume from system suspend
      nvme-pci: Use pci_suspend_retains_context() API during suspend

 drivers/nvme/host/pci.c                |  3 ++-
 drivers/pci/controller/dwc/pcie-qcom.c | 12 ++++++++++++
 drivers/pci/pci.c                      | 34 ++++++++++++++++++++++++++++++++++
 include/linux/pci.h                    |  9 +++++++++
 4 files changed, 57 insertions(+), 1 deletion(-)
---
base-commit: 254f49634ee16a731174d2ae34bc50bd5f45e731
change-id: 20260414-l1ss-fix-6c9cf2451944

Best regards,