Re: [PATCH] arm64: dts: qcom: x1e80100: enable GICv3 ITS for PCIe

From: Manivannan Sadhasivam
Date: Thu Jul 11 2024 - 12:20:02 EST


On Thu, Jul 11, 2024 at 05:01:15PM +0200, Johan Hovold wrote:
> [ +CC: Mani ]
>
> On Thu, Jul 11, 2024 at 11:58:08AM +0200, Johan Hovold wrote:
> > On Thu, Jul 11, 2024 at 11:54:15AM +0200, Konrad Dybcio wrote:
> > > On 11.07.2024 11:02 AM, Johan Hovold wrote:
> > > > The DWC PCIe controller can be used with its internal MSI controller or
> > > > with an external one such as the GICv3 Interrupt Translation Service
> > > > (ITS).
> > > >
> > > > Add the msi-map properties needed to use the GIC ITS. This will also
> > > > make Linux switch to the ITS implementation, which allows for assigning
> > > > affinity to individual MSIs.
>
> > > X1E CRD throws tons of correctable errors with this on PCIe6a:
>
> > What branch are you using? Abel reported seeing this with his branch
> > which has a few work-in-progress patches that try to enable 4-lane PCIe.
> >
> > There are no errors with my wip branch based on rc7, and I have the same
> > drive as Abel.
>
> For some reason I don't get these errors on my machine, but this has now
> been confirmed by two other people running my rc branch (including Abel)
> so something is broken here, for example, with the PHY settings.
>

I saw AER errors on Abel's machine during probe with the 4-lane PHY settings,
which might explain why the link width got downgraded to x2. This is still
unresolved.
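
A minimal sketch for spotting such a downgrade from userspace, assuming the
standard PCI sysfs attributes current_link_width and max_link_width are
exposed for the device. The function name downgraded_links and the sysfs_root
parameter are made up for illustration. Note that max_link_width reflects what
the device advertises, which may differ from what the PHY is configured for:

```python
from pathlib import Path


def read_attr(dev, name):
    """Return the stripped contents of a sysfs attribute, or None if unreadable."""
    try:
        return (dev / name).read_text().strip()
    except OSError:
        return None


def downgraded_links(sysfs_root="/sys/bus/pci/devices"):
    """List (device, current_width, max_width) for links running below max width.

    sysfs_root is a hypothetical parameter so the scan can be pointed at a
    test directory instead of the real sysfs tree.
    """
    results = []
    root = Path(sysfs_root)
    if not root.is_dir():
        return results
    for dev in sorted(root.iterdir()):
        cur = read_attr(dev, "current_link_width")
        mx = read_attr(dev, "max_link_width")
        if cur is None or mx is None:
            continue  # not a PCIe device, or attribute not exposed
        try:
            cur_w, max_w = int(cur), int(mx)
        except ValueError:
            continue  # unexpected attribute format, skip rather than guess
        if 0 < cur_w < max_w:
            results.append((dev.name, cur_w, max_w))
    return results


if __name__ == "__main__":
    for bdf, cur_w, max_w in downgraded_links():
        print(f"{bdf}: running at x{cur_w}, device supports x{max_w}")
```

Running it before and after a PHY change gives a quick way to see whether the
negotiated width moved.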

> I saw five correctable errors once, when running linux-next, but it took
> several minutes and they were still minutes apart.
>
> > Also note that the errors happen also without this patch applied, they
> > are just being reported now.
>
> I guess we need to track down what is causing these errors before
> enabling ITS (and thereby the error reporting).
>
> At least L0s is not involved here, as it was with sc8280xp, as the
> NVMe controllers in question do not support it.
>
> Perhaps something is off because we're running the link at half width?
>

My hunch is the PHY settings. But Abel cross-checked the PHY settings against
internal documentation and they seem to match. Also, Qcom submitted a series
that is supposed to fix stability issues with Gen4 [1]. According to Qcom, a
Gen4 x4 setup works on the SA8775P-RIDE board with this series. But Abel
confirmed that it didn't help him with the link downgrade issue.

Perhaps you can give it a try and see if it makes any difference for this issue?

In the meantime, I'm checking with Qcom contacts on this.

- Mani

[1] https://lore.kernel.org/linux-pci/20240320071527.13443-1-quic_schintav@xxxxxxxxxxx/

--
மணிவண்ணன் சதாசிவம்