Re: [PATCH] arm64: dts: qcom: x1e80100: enable GICv3 ITS for PCIe

From: Manivannan Sadhasivam
Date: Fri Jul 12 2024 - 09:32:16 EST


On Fri, Jul 12, 2024 at 10:20:24AM +0200, Johan Hovold wrote:
> On Thu, Jul 11, 2024 at 06:59:22PM +0200, Johan Hovold wrote:
> > On Thu, Jul 11, 2024 at 10:11:53PM +0530, Manivannan Sadhasivam wrote:
> > > On Thu, Jul 11, 2024 at 09:49:52PM +0530, Manivannan Sadhasivam wrote:
>
> > > > My hunch is the PHY settings. But Abel cross checked the PHY settings with
> > > > internal documentation and they seem to match. Also, Qcom submitted a series
> > > > that is supposed to fix stability issues with Gen4 [1]. With this series, the Gen4
> > > > x4 setup is working on the SA8775P-RIDE board as reported by Qcom. But Abel
> > > > confirmed that it didn't help him with the link downgrade issue.
> > > >
> > > > Perhaps you can give it a try and see if it makes any difference for
> > > > this issue?
> >
> > If there are known issues with running at Gen4 speed without that
> > series, then it seems quite likely that doing so anyway could also cause
> > correctable errors.
> >
> > Unfortunately, I got a hypervisor reset when I tried booting with that
> > series so there appears to be some implicit dependency on something
> > else (e.g. the 4l stuff).
>
> The first patch in that series breaks icc handling, which crashes
> machines like the X13s and the x1e80100 CRD on boot. I've just reported
> this here:
>
> https://lore.kernel.org/lkml/ZpDlf5xD035x2DqL@xxxxxxxxxxxxxxxxxxxx/
>

Ah, what a blunder... Thanks for reporting.

But I wonder why Abel did not see this crash when he tested this series
for 4L.

> With that fixed, and with the hacky dependency on having max-link-speed
> specified in the DT for the series to have any effect at all, the gen4
> stability series indeed seems to make the AER error go away (Abel just
> confirmed using a branch I'd prepared).
>

Cool, good to know.

> Let's try to get that series in shape and merged in some form as
> everyone will be hitting these Correctable Errors currently with the
> NVMe on x1e80100.
>

Sure. This series needs a respin anyway due to its dependency on the OPP series
that just got merged. But getting it merged for 6.11 is quite unlikely.

- Mani

--
மணிவண்ணன் சதாசிவம்