Re: [PATCH V4 0/3] PCI: designware-ep: Fix DBI access before core init

From: Vidya Sagar
Date: Mon Oct 03 2022 - 07:19:09 EST


Hi Bjorn,
Did you find time to take a look at my responses?
If you have nothing further to add, I'll address the review comments as mentioned and send the V5 patch for review.
Please let me know.

Thanks,
Vidya Sagar

On 9/26/2022 8:32 PM, Vidya Sagar wrote:


On 9/20/2022 4:10 AM, Bjorn Helgaas wrote:


On Tue, Sep 20, 2022 at 12:03:39AM +0530, Vidya Sagar wrote:
This series attempts to fix an issue where core register (e.g. DBI) accesses
cause system hangs on platforms whose core initialization depends on the
availability of the PCIe reference clock from the host.
This series is verified on Tegra194 & Tegra234 platforms.

I think this design is just kind of weird, specifically, the fact that
setting .core_init_notifier makes dw_pcie_ep_init() bail out early.
The usual pattern is more like "if the specific driver sets this
function pointer, the generic code calls it."
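
For illustration, the "usual pattern" described above might look roughly like the sketch below. This is only a minimal stand-alone model: struct ep_ops, late_init, ep_init() and count_calls() are hypothetical names chosen for the example and are not the DesignWare EP API.

```c
#include <stddef.h>

/* Hypothetical names for illustration; not the DesignWare EP API. */
struct ep_ops {
	int (*late_init)(void *ctx);	/* optional hook, may be NULL */
};

struct ep {
	const struct ep_ops *ops;
	int generic_done;
};

/*
 * Generic code always performs its own steps. Setting the hook only
 * adds a call; it never makes the generic code bail out early.
 */
static int ep_init(struct ep *ep, void *ctx)
{
	ep->generic_done = 1;

	if (ep->ops && ep->ops->late_init)
		return ep->ops->late_init(ctx);

	return 0;
}

/* Example driver hook: counts how often the generic code called it. */
static int count_calls(void *ctx)
{
	int *calls = ctx;

	(*calls)++;
	return 0;
}
```

With this shape, a driver that leaves late_init unset still gets the full generic path, and a driver that sets it gets one extra call at a well-defined point.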

Thanks for the review Bjorn.

Typically, PCIe endpoints run off the reference clock supplied by the host they are connected to. Our hardware designers followed the same approach here; the main difference is that the controllers operating in endpoint mode are not standalone controllers but part of a bigger Tegra (or Qcom) system.
So the complete controller initialization sequence can't happen during boot itself, and the boot initialization sequence needs to be split into two parts: a) early initialization, which just parses DT and does the programming that doesn't depend on the reference clock from the host, and b) the remaining programming, which can only be performed after the reference clock is available from the host.
We are working with our hardware designers to remove this dependency on the reference clock from the host, so that all the programming can happen during boot itself and the hardware is smart enough to switch to the reference clock from the host when it becomes available. But this is for future designs; Tegra194 & Tegra234 continue to have this limitation.
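
A rough model of the two-part split described above, as a minimal stand-alone sketch. The struct and function names (ep_ctrl, ep_early_init(), ep_late_init()) and the choice of error codes are assumptions made for the example and do not come from the patches.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical controller state; field names are illustrative only. */
struct ep_ctrl {
	bool dt_parsed;		/* part a) has run */
	bool refclk_present;	/* host is driving the reference clock */
	bool core_programmed;	/* part b) has run */
};

/* Part a): boot-time work that needs no reference clock from the host. */
static int ep_early_init(struct ep_ctrl *c)
{
	c->dt_parsed = true;
	return 0;
}

/*
 * Part b): programming that touches core registers. It must not run
 * until the host reference clock is available, so it refuses to
 * proceed instead of accessing the core too early.
 */
static int ep_late_init(struct ep_ctrl *c)
{
	if (!c->dt_parsed)
		return -EINVAL;
	if (!c->refclk_present)
		return -EAGAIN;

	c->core_programmed = true;
	return 0;
}
```

In the sketch, calling ep_late_init() before the clock is present fails cleanly, which stands in for the hang that an early DBI access would cause on real hardware.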


The name "dw_pcie_ep_init_complete()" is not as helpful as it could
be: it tells us something about what has happened before this point,
but it doesn't tell us anything about what dw_pcie_ep_init_complete()
*does*.

To be in line with the new ep_init_late ops that I added in this series, would it be fine to name this dw_pcie_ep_init_late()?


Same thing with dw_pcie_ep_init_notify() -- it doesn't tell us
anything about what the function *does*.

Would it make more sense to rename it to dw_pcie_ep_linkup_notify()?

  I see that it calls
pci_epc_init_notify(), which calls a notifier call chain (currently
always empty except for a test case).  I think pci_epc_linkup() is a
better name because it says something about what's happening: the link
is now up and we're telling somebody about it.  "pci_epc_init_notify()"
doesn't convey that.  "pci_epc_core_initialized()" might.

Ok. I'll rename it to pci_epc_core_initialized().


It looks like both qcom and tegra wait for an interrupt before calling
dw_pcie_ep_init_notify(), but I'm a little concerned because I can't
figure out what specifically they do to start the process that
ultimately generates the interrupt.

As part of starting the endpoint, as described in https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/Documentation/PCI/endpoint/pci-test-howto.rst#n101
we execute the following:
echo 1 > controllers/141a0000.pcie-ep/start
This enables interrupt generation for toggles on the PERST# line.

  Presumably they request the IRQ
*before* starting the process, but there's not much between the
devm_request_threaded_irq() and the interrupt handler, which makes me
wonder if both are racy.

I don't think there is any race between these two, since 'start' is initiated from user space. Not sure if I'm missing something here, though.
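
The ordering argument above can be modeled with the small sketch below. It assumes that the PERST# interrupt source stays disabled from probe until user space writes to 'start', as described for Tegra earlier in this mail; whether qcom behaves the same way is not confirmed here. All names (perst_irq, ep_probe(), ep_start(), perst_toggle()) are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the PERST# interrupt; not driver code. */
struct perst_irq {
	void (*handler)(struct perst_irq *irq);
	bool enabled;	/* set only by the user-space 'start' */
	int handled;
};

static void perst_handler(struct perst_irq *irq)
{
	irq->handled++;
}

/* Probe: install the handler, leave interrupt generation disabled. */
static void ep_probe(struct perst_irq *irq)
{
	irq->handler = perst_handler;
	irq->enabled = false;
	irq->handled = 0;
}

/* User space: echo 1 > .../start */
static void ep_start(struct perst_irq *irq)
{
	irq->enabled = true;
}

/* Hardware event: PERST# toggles; delivered only when enabled. */
static void perst_toggle(struct perst_irq *irq)
{
	if (irq->enabled && irq->handler)
		irq->handler(irq);
}
```

Under that assumption, a PERST# toggle that arrives between probe and 'start' is simply not delivered, so the handler can only run after it has been installed.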


Manivannan, could you please verify on qcom platforms?

V4:
* Addressed review comments from Bjorn and Manivannan
* Added .ep_init_late() ops
* Added patches to refactor code in qcom and tegra platforms

Vidya Sagar (3):
   PCI: designware-ep: Fix DBI access before core init
   PCI: qcom-ep: Refactor EP initialization completion
   PCI: tegra194: Refactor EP initialization completion

  .../pci/controller/dwc/pcie-designware-ep.c   | 112 ++++++++++--------
  drivers/pci/controller/dwc/pcie-designware.h  |  10 +-
  drivers/pci/controller/dwc/pcie-qcom-ep.c     |  27 +++--
  drivers/pci/controller/dwc/pcie-tegra194.c    |   4 +-
  4 files changed, 85 insertions(+), 68 deletions(-)

--
2.17.1