Re: [PATCH v6 2/5] PCI: qcom: Add retry logic for link to be stable in L1ss

From: Manivannan Sadhasivam
Date: Wed Sep 14 2022 - 01:59:18 EST


On Wed, Sep 14, 2022 at 07:15:35AM +0530, Krishna Chaitanya Chundru wrote:
>
> On 9/13/2022 10:09 PM, Manivannan Sadhasivam wrote:
> > On Tue, Sep 13, 2022 at 07:54:22PM +0530, Krishna Chaitanya Chundru wrote:
> > > On 9/12/2022 11:03 PM, Manivannan Sadhasivam wrote:
> > > > On Mon, Sep 12, 2022 at 09:39:36PM +0530, Krishna Chaitanya Chundru wrote:
> > > > > On 9/10/2022 1:20 AM, Bjorn Helgaas wrote:
> > > > > > On Fri, Sep 09, 2022 at 02:14:41PM +0530, Krishna chaitanya chundru wrote:
> > > > > > > Some specific devices are taking time to settle the link in L1ss.
> > > > > > > So added a retry logic before returning from the suspend op.
> > > > > > "L1ss" is not a state. If you mean "L1.1" or "L1.2", say that. Also
> > > > > > in code comments below.
> > > > > Yes, L1ss means L1.1 and L1.2. We will update it in the next patch.
> > > > > > s/So added a/Add/
> > > > > >
> > > > > > What are these specific devices? Is this a qcom controller defect?
> > > > > > An endpoint defect that should be addressed via some kind of generic
> > > > > > quirk?
> > > > > This depends on the endpoint device and varies from device to
> > > > > device.
> > > > >
> > > > Can we identify the source of the traffic? Is the NVMe driver not
> > > > flushing its queues correctly?
> > > We found that it is not from NVMe data; we are seeing some physical layer
> > > activity on the protocol analyzer.
> > >
> > Okay
> >
> > > > > We think this is not a defect: if there is some traffic on the link,
> > > > > the link will not go to L1ss.
> > > > >
> > > > Is this hack still required even after switching to syscore ops?
> > > >
> > > > Thanks,
> > > > Mani
> > > Yes, Mani, it is still required. Just before the syscore ops run, there will
> > > be a PCI transaction to mask MSI-X interrupts.
> > >
> > Hmm. I'm getting slightly confused here. What really happens when you do
> > the resource teardown during suspend and the link has not entered L1SS?
> >
> > Since PHY is powered by MX domain, I'm wondering why we should wait for
> > the link to be in L1SS?
> >
> > Thanks,
> > Mani
>
> Mani, we need to turn off the link only after the link has entered L1ss. If we
> do it before that, some transactions will be disturbed and we see a link down.
>
> The MX power rail controls the digital logic of the PHY and only tries to retain
> the link state. The analog logic is controlled by the CX rail alone, so we turn
> off the clks and PHY only when the link is in L1ss.
>

Okay, thanks for the clarification. Please add this info as a comment just above
the change.
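
To illustrate the kind of comment I mean, here is a rough user-space model of
the wait, not the driver code itself. Everything in it is an assumption made
for illustration: fake_link, fake_read_pm_stts(), wait_for_l1ss(), the bit
position, and the poll count all stand in for the real PARF register access
and the 200 ms / 1 ms timing in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the "link in L1ss" bit of the PM status register. */
#define LINKST_IN_L1SUB (1u << 8)

/* Simulated link: the L1ss bit shows up after a number of register reads. */
struct fake_link {
	bool up;
	unsigned int reads_until_l1ss;
};

/* Hypothetical replacement for readl() on the PM status register. */
static uint32_t fake_read_pm_stts(struct fake_link *link)
{
	if (link->reads_until_l1ss > 0) {
		link->reads_until_l1ss--;
		return 0;
	}
	return LINKST_IN_L1SUB;
}

enum wait_result { LINK_DOWN, LINK_IN_L1SS, LINK_TIMEOUT };

/*
 * The MX rail powers only the digital logic of the PHY and merely retains
 * the link state. The analog logic sits on the CX rail, so clocks and PHY
 * may be turned off only once the link is in L1ss. Turning them off
 * earlier disturbs in-flight transactions and results in a link down.
 * Some endpoints need time to settle, hence the bounded polling.
 */
static enum wait_result wait_for_l1ss(struct fake_link *link,
				      unsigned int max_polls)
{
	for (unsigned int i = 0; i < max_polls; i++) {
		/* No active link: nothing to wait for. */
		if (!link->up)
			return LINK_DOWN;

		if (fake_read_pm_stts(link) & LINKST_IN_L1SUB)
			return LINK_IN_L1SS;

		/* The real loop would delay about 1 ms between polls here. */
	}

	/* Link never settled: the caller must leave clocks and PHY on. */
	return LINK_TIMEOUT;
}
```

In this model the caller would go on to the suspend op for LINK_DOWN and
LINK_IN_L1SS, and return early for LINK_TIMEOUT, which matches the three exits
of the loop in the patch.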

Thanks,
Mani

> > > > > > > Signed-off-by: Krishna chaitanya chundru <quic_krichai@xxxxxxxxxxx>
> > > > > > > ---
> > > > > > > drivers/pci/controller/dwc/pcie-qcom.c | 36 +++++++++++++++++++++++-----------
> > > > > > > 1 file changed, 25 insertions(+), 11 deletions(-)
> > > > > > >
> > > > > > > diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
> > > > > > > index 6e04d0d..15c2067 100644
> > > > > > > --- a/drivers/pci/controller/dwc/pcie-qcom.c
> > > > > > > +++ b/drivers/pci/controller/dwc/pcie-qcom.c
> > > > > > > @@ -1809,26 +1809,40 @@ static int qcom_pcie_probe(struct platform_device *pdev)
> > > > > > > static int __maybe_unused qcom_pcie_pm_suspend(struct qcom_pcie *pcie)
> > > > > > > {
> > > > > > > u32 val;
> > > > > > > + ktime_t timeout, start;
> > > > > > > struct dw_pcie *pci = pcie->pci;
> > > > > > > struct device *dev = pci->dev;
> > > > > > > if (!pcie->cfg->supports_system_suspend)
> > > > > > > return 0;
> > > > > > > - /* if the link is not active turn off clocks */
> > > > > > > - if (!dw_pcie_link_up(pci)) {
> > > > > > > - dev_info(dev, "Link is not active\n");
> > > > > > > - goto suspend;
> > > > > > > - }
> > > > > > > + start = ktime_get();
> > > > > > > + /* Wait max 200 ms */
> > > > > > > + timeout = ktime_add_ms(start, 200);
> > > > > > > - /* if the link is not in l1ss don't turn off clocks */
> > > > > > > - val = readl(pcie->parf + PCIE20_PARF_PM_STTS);
> > > > > > > - if (!(val & PCIE20_PARF_PM_STTS_LINKST_IN_L1SUB)) {
> > > > > > > - dev_warn(dev, "Link is not in L1ss\n");
> > > > > > > - return 0;
> > > > > > > + while (1) {
> > > > > > > +
> > > > > > > + if (!dw_pcie_link_up(pci)) {
> > > > > > > + dev_warn(dev, "Link is not active\n");
> > > > > > > + break;
> > > > > > > + }
> > > > > > > +
> > > > > > > + /* if the link is not in l1ss don't turn off clocks */
> > > > > > > + val = readl(pcie->parf + PCIE20_PARF_PM_STTS);
> > > > > > > + if ((val & PCIE20_PARF_PM_STTS_LINKST_IN_L1SUB)) {
> > > > > > > + dev_dbg(dev, "Link enters L1ss after %d ms\n",
> > > > > > > + ktime_to_ms(ktime_get() - start));
> > > > > > > + break;
> > > > > > > + }
> > > > > > > +
> > > > > > > + if (ktime_after(ktime_get(), timeout)) {
> > > > > > > + dev_warn(dev, "Link is not in L1ss\n");
> > > > > > > + return 0;
> > > > > > > + }
> > > > > > > +
> > > > > > > + udelay(1000);
> > > > > > > }
> > > > > > > -suspend:
> > > > > > > if (pcie->cfg->ops->suspend)
> > > > > > > pcie->cfg->ops->suspend(pcie);
> > > > > > > --
> > > > > > > 2.7.4
> > > > > > >