Re: [PATCH] PCI: qcom-ep: Do not enable resources during probe()

From: Bjorn Helgaas
Date: Wed Aug 21 2024 - 18:56:29 EST


On Wed, Aug 14, 2024 at 05:25:47AM +0900, Krzysztof Wilczyński wrote:
> > Starting from commit 869bc5253406 ("PCI: dwc: ep: Fix DBI access failure
> > for drivers requiring refclk from host"), all the hardware register access
> > (like DBI) were moved to dw_pcie_ep_init_registers() which gets called only
> > in qcom_pcie_perst_deassert() i.e., only after the endpoint received refclk
> > from host.
> >
> > So there is no need to enable the endpoint resources (like clk, regulators,
> > PHY) during probe(). Hence, remove the call to qcom_pcie_enable_resources()
> > helper from probe(). This was added earlier because dw_pcie_ep_init() was
> > doing DBI access, which is not done now.
> >
> > While at it, let's also call dw_pcie_ep_deinit() in err path to deinit the
> > EP controller in the case of failure.
>
> Applied to controller/qcom, thank you!
>
> [1/1] PCI: qcom-ep: Do not enable resources during probe()
> https://git.kernel.org/pci/pci/c/cd0b3e13ec30

I think we do need this, but I dropped it for now pending a commit log
that says "we're fixing a crash" and explains how.

The current log says "869bc5253406 moved hardware register access like
DBI to dw_pcie_ep_init_registers()", but 869bc5253406 actually moved a
bunch of register accesses from dw_pcie_ep_init() to
dw_pcie_ep_init_complete(), and a subsequent patch renamed
dw_pcie_ep_init_complete() to dw_pcie_ep_init_registers(). I did
eventually figure out the rename, but it took a while to make that
leap.

It also says dw_pcie_ep_init_registers() is called only from
qcom_pcie_perst_deassert(), but obviously all drivers call it. I
think what you meant is that on qcom and tegra194,
dw_pcie_ep_init_registers() isn't called from .probe(); it's called
later because they require refclk to access the registers, so qcom and
tegra194 call it after PERST# is deasserted, because then refclk is
available.

Trying to understand the 869bc5253406 reference: I guess the
point is that the dw_pcie_ep_init_registers() work depends on
qcom_pcie_enable_resources(), and before 869bc5253406, that work was
done by qcom_pcie_ep_probe() calling dw_pcie_ep_init(), so it had to
call qcom_pcie_enable_resources() first.

But after 869bc5253406, dw_pcie_ep_init_registers() is done in
qcom_pcie_perst_deassert(), which already calls
qcom_pcie_enable_resources(). So qcom_pcie_ep_probe() no longer needs
to call qcom_pcie_enable_resources().

As far as the *crash*, phy_power_on() has been called from
qcom_pcie_ep_probe() since the very beginning in f55fee56a631 ("PCI:
qcom-ep: Add Qualcomm PCIe Endpoint controller driver"). But
apparently on some new platforms phy_power_on() depends on refclk and
(I assume) it causes a crash when done from qcom_pcie_ep_probe().

So I would think the commit log should look something like this:

PCI: qcom-ep: Postpone PHY power-on until refclk available

qcom_pcie_enable_resources() is called by qcom_pcie_ep_probe() and
powers on PHYs. On new platforms like X, Y, Z, this depends on
refclk from the RC, which may not be available at the time of
qcom_pcie_ep_probe(), so this causes a crash in the qcom-ep driver.

qcom_pcie_enable_resources() is already called by
qcom_pcie_perst_deassert() when PERST# is deasserted, and refclk is
available at that time.

Remove the unnecessary call from qcom_pcie_ep_probe() to prevent the
crash on X, Y, Z.

Although I do have the question of what happens if the RC deasserts
PERST# before qcom-ep is loaded. We probably don't execute
qcom_pcie_perst_deassert() in that case, so how does the init happen?

Bjorn