Re: [PATCH v2 2/3] PCI: qcom: Check for the presence of a device instead of Link up during suspend

From: Manivannan Sadhasivam

Date: Wed Nov 19 2025 - 11:15:25 EST


On Wed, Nov 19, 2025 at 04:03:22PM +0800, zhangsenchuan wrote:
>
>
>
> > -----Original Messages-----
> > From: "Manivannan Sadhasivam" <mani@xxxxxxxxxx>
> > Send time:Tuesday, 18/11/2025 01:37:01
> > To: "Bjorn Helgaas" <helgaas@xxxxxxxxxx>
> > Cc: "Manivannan Sadhasivam" <manivannan.sadhasivam@xxxxxxxxxxxxxxxx>, lpieralisi@xxxxxxxxxx, kwilczynski@xxxxxxxxxx, bhelgaas@xxxxxxxxxx, will@xxxxxxxxxx, linux-pci@xxxxxxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx, robh@xxxxxxxxxx, linux-arm-msm@xxxxxxxxxxxxxxx, zhangsenchuan@xxxxxxxxxxxxxxxxxx, vincent.guittot@xxxxxxxxxx
> > Subject: Re: [PATCH v2 2/3] PCI: qcom: Check for the presence of a device instead of Link up during suspend
> >
> > On Thu, Nov 13, 2025 at 11:22:50AM -0600, Bjorn Helgaas wrote:
> > > On Thu, Nov 13, 2025 at 10:24:17PM +0530, Manivannan Sadhasivam wrote:
> > > > On Thu, Nov 13, 2025 at 10:41:47AM -0600, Bjorn Helgaas wrote:
> > > > > On Fri, Nov 07, 2025 at 10:13:18AM +0530, Manivannan Sadhasivam wrote:
> > > > > > The suspend handler checks for the PCIe Link up to decide when to turn off
> > > > > > the controller resources. But this check is racy as the PCIe Link can go
> > > > > > down just after this check.
> > > > > >
> > > > > > So use the newly introduced API, pci_root_ports_have_device() that checks
> > > > > > for the presence of a device under any of the Root Ports to replace the
> > > > > > Link up check.
> > > > >
> > > > > Why is pci_root_ports_have_device() itself not racy?
> > > >
> > > > Because it is very uncommon for the 'pci_dev' to go away during the
> > > > host controller suspend. It might still be possible in edge cases,
> > > > but very common as the link down. I can reword it.
> > >
> > > I guess it's better to acknowledge replacing one race with another
> > > than it would be to suggest that this *removes* a race.
> > >
> >
> > Ok.
> >
> > > But I don't understand the point of this. Is
> > > pci_root_ports_have_device() *less* racy than the
> > > qcom_pcie_suspend_noirq() check? Why would that be?
> > >
> >
> > The check is supposed to perform deinit only if there are no devices connected
> > to the slot. And the reason to skip the deinit was mostly due to driver behavior
> > like NVMe driver, which expects the device to be in D0 even during system
> > suspend on non-x86 platforms.
> >
> > Since the check is for the existence of the device nevertheless, I thought,
> > making use of pci_root_ports_have_device() serves the purpose instead of
> > checking the data link layer status.
> >
> > > I'm kind of skeptical about adding pci_root_ports_have_device() at
> > > all. It seems like it just encourages racy behavior in drivers.
> > >
> >
> > I agree that though it is not very common, but with async suspend, it is
> > possible that 'pci_dev' may get removed during controller suspend.
> >
> > So I've dropped this series from controller/dwc until we conclude.
> >
>
> Hi, Mani
>
> I see that this series from controller/dwc has been temporarily removed.
> Do I need to wait for a conclusion later before submitting the code, or
> do I need to continue submitting the pcie v6 patch based on the latest
> 6.18-rc6 branch?
>

I'm coming up with a new version of the patch, but you do not need to wait for
it.

- Mani

> Kind regards,
> Senchuan Zhang
>
> >
> > > > > > Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@xxxxxxxxxxxxxxxx>
> > > > > > ---
> > > > > > drivers/pci/controller/dwc/pcie-qcom.c | 6 ++++--
> > > > > > 1 file changed, 4 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
> > > > > > index 805edbbfe7eb..b2b89e2e4916 100644
> > > > > > --- a/drivers/pci/controller/dwc/pcie-qcom.c
> > > > > > +++ b/drivers/pci/controller/dwc/pcie-qcom.c
> > > > > > @@ -2018,6 +2018,7 @@ static int qcom_pcie_probe(struct platform_device *pdev)
> > > > > > static int qcom_pcie_suspend_noirq(struct device *dev)
> > > > > > {
> > > > > > struct qcom_pcie *pcie;
> > > > > > + struct dw_pcie_rp *pp;
> > > > > > int ret = 0;
> > > > > >
> > > > > > pcie = dev_get_drvdata(dev);
> > > > > > @@ -2053,8 +2054,9 @@ static int qcom_pcie_suspend_noirq(struct device *dev)
> > > > > > * powerdown state. This will affect the lifetime of the storage devices
> > > > > > * like NVMe.
> > > > > > */
> > > > > > - if (!dw_pcie_link_up(pcie->pci)) {
> > > > > > - qcom_pcie_host_deinit(&pcie->pci->pp);
> > > > > > + pp = &pcie->pci->pp;
> > > > > > + if (!pci_root_ports_have_device(pp->bridge->bus)) {
> > > > > > + qcom_pcie_host_deinit(pp);
> > > > > > pcie->suspended = true;
> > > > > > }
> > > > > >
> > > > > > --
> > > > > > 2.48.1
> > > > > >
> > > >
> > > > --
>

--
மணிவண்ணன் சதாசிவம்