Re: [PATCH v2] PCI: qcom: fix IPQ8074 Gen2 support

From: Bjorn Helgaas
Date: Tue Jun 21 2022 - 17:43:52 EST


On Tue, Jun 21, 2022 at 11:05:17PM +0200, Robert Marko wrote:
> On Tue, 21 Jun 2022 at 22:32, Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> > On Tue, Jun 21, 2022 at 01:23:30PM +0200, Robert Marko wrote:
> > > IPQ8074 has one Gen2 and one Gen3 port, currently the Gen2 port will
> > > cause the system to hang as its using DBI registers in the .init
> > > and those are only accesible after phy_power_on().
> >
> > Is the fact that IPQ8074 has both a Gen2 and a Gen3 port relevant to
> > this patch? I don't see the connection.
>
> Not really, I can remove it from the description as this only affects the Gen2
> port, Gen3 support is dependant on the IPQ6018 support getting merged as
> it uses the same controller.

Great, I think unrelated details confuse things.

> > I see that qcom_pcie_host_init() does:
> >
> > qcom_pcie_host_init
> > pcie->cfg->ops->init(pcie)
> > phy_power_on(pcie->phy)
> > pcie->cfg->ops->post_init(pcie)
> >
> > and that you're moving DBI register accesses from
> > qcom_pcie_init_2_3_3() to qcom_pcie_post_init_2_3_3().
> >
> > But I also see DBI register accesses in other .init() functions:
> >
> > qcom_pcie_init_2_1_0
> > qcom_pcie_init_1_0_0 (oddly out of order)
> > qcom_pcie_init_2_3_2
> > qcom_pcie_init_2_4_0
> >
> > Why do these accesses not need to be moved? I assume it's because
> > pcie->phy is an optional PHY and phy_power_on() does nothing on those
> > controllers?
>
> As far as I could figure out from QCA-s 5.4 kernel, various commits,
> and QCA-s attempts to solve this already upstream the Gen2
> controller in IPQ8074 is a bit special and requires the PHY to be
> powered on before DBI could be accessed or else the board will hang
> as it does for me.
>
> I can only assume that this is an IPQ8074-specific quirk and other
> IP-s are not affected like this, so they were not broken.

> > Whatever the reason, I think the DBI accesses should be done
> > consistently in .post_init(). I see that Dmitry's previous patches
> > removed all those .post_init() functions, but I think the consistency
> > is worth having.
> >
> > Perhaps we could reorder the patches so this patch comes first, moves
> > the DBI accesses into .post_init(), then Dmitry's patches could be
> > rebased on top to drop the clock handling?
> >
> > > So solve this by splitting the DBI read/writes to .post_init.
>
> I am open to anything to get this fixed properly, you are gonna need
> to point me in the right direction as I am really new to PCI.

Unless there's a reason *not* to move the DBI accesses for other
variants, I think we should move them all to .post_init(). Otherwise
it's just magic -- there's no indication in the code about why IPQ8074
needs to be different from all the rest.

a0fd361db8e5 appeared in v5.11, so presumably IPQ8074 has been broken
since then? Are there any problem report URLs or references to the
attempts you mentioned above that we could include here?

Since folks may want to backport the fix to v5.11, I'd probably do
this patch by itself to make the backport easier, then add a second
patch to move the DBI accesses for all the other variants.

My personal opinion is that you should do this based on v5.19-rc1, and
then we can rebase Dmitry's patches on top. That would make all the
patches simpler and it would make yours easier to backport. But
that's the sort of thing Dmitry, Stanimir, Andy, and Bjorn A. could
weigh in on.

Bjorn H.