Re: [PATCH] PCI: keystone: Don't enable BAR0 if link is not detected

From: Bjorn Helgaas
Date: Thu Oct 12 2023 - 12:43:45 EST


On Thu, Oct 12, 2023 at 10:15:09AM +0530, Siddharth Vadapalli wrote:
> Hello Bjorn,
>
> Thank you for reviewing the patch.
>
> On 11/10/23 19:16, Bjorn Helgaas wrote:
> > Hi Siddharth,
> >
> > On Wed, Oct 11, 2023 at 06:04:51PM +0530, Siddharth Vadapalli wrote:
> >> Since the function dw_pcie_host_init() ignores the absence of link under
> >> the assumption that link can come up later, it is possible that the
> >> pci_host_probe(bridge) function is invoked even when no endpoint device
> >> is connected. In such a situation, the ks_pcie_v3_65_add_bus() function
> >> configures BAR0 when the link is not up, resulting in Completion Timeouts
> >> during the MSI configuration performed later by the PCI Express Port driver
> >> to set up AER, PME and other services. Thus, leave BAR0 disabled if link is
> >> not yet detected when the ks_pcie_v3_65_add_bus() function is invoked.
> >
> > I'm trying to make sense of this. In this path:
> >
> >   pci_host_probe
> >     pci_scan_root_bus_bridge
> >       pci_register_host_bridge
> >         bus = pci_alloc_bus(NULL)    # root bus
> >         bus->ops->add_bus(bus)
> >           ks_pcie_v3_65_add_bus
> >
> > The BAR0 in question must belong to a Root Port. And it sounds like
> > the issue must be related to MSI-X, since the original MSI doesn't
> > involve any BARs.
>
> Yes, the issue is related to MSI-X. I will list down the exact set of function
> calls below as well as the place where the completion timeout first occurs:
> ks_pcie_probe
>   dw_pcie_host_init
>     pci_host_probe
>       pci_bus_add_devices
>         pci_bus_add_device
>           device_attach
>             __device_attach
>               bus_for_each_drv
>                 __device_attach_driver (invoked using fn(drv, data))
>                   driver_probe_device
>                     __driver_probe_device
>                       really_probe
>                         pci_device_probe
>                           pcie_portdrv_probe
>                             pcie_port_device_register
>                               pcie_init_service_irqs
>                                 pcie_port_enable_irq_vec
>                                   pci_alloc_irq_vectors
>                                     pci_alloc_irq_vectors_affinity
>                                       __pci_enable_msix_range
>                                         msix_capability_init
>                                           msix_setup_interrupts
>                                             msix_setup_msi_descs
>                                               msix_prepare_msi_desc
> In this function: msix_prepare_msi_desc, the following readl()
> causes completion timeout:
> desc->pci.msix_ctrl = readl(addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
> The completion timeout with the readl is only observed when the link
> is down (No Endpoint device is actually connected to the PCIe
> connector slot).

Do you know the address ("addr")? From pci_msix_desc_addr(), it looks
like it should be:

  desc->pci.mask_base + desc->msi_index * PCI_MSIX_ENTRY_SIZE

and desc->pci.mask_base should be dev->msix_base, which we got from
msix_map_region(), which ioremaps part of the BAR indicated by the
MSI-X Table Offset/Table BIR register.
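
For reference, the address arithmetic described above can be modeled in
a few lines of user-space C. This is only a sketch under stated
assumptions: the constants match include/uapi/linux/pci_regs.h, but
msix_table_phys() and msix_vector_ctrl_addr() are hypothetical helper
names, not kernel functions, and the bar_start[] array stands in for
what pci_resource_start() would return for each BAR.

```c
#include <assert.h>
#include <stdint.h>

/* Values as defined in include/uapi/linux/pci_regs.h */
#define PCI_MSIX_TABLE_BIR          0x00000007u  /* BAR index bits */
#define PCI_MSIX_TABLE_OFFSET       0xfffffff8u  /* table offset bits */
#define PCI_MSIX_ENTRY_SIZE         16
#define PCI_MSIX_ENTRY_VECTOR_CTRL  12

/* The Vector Control dword must lie inside one table entry. */
static_assert(PCI_MSIX_ENTRY_VECTOR_CTRL < PCI_MSIX_ENTRY_SIZE,
              "vector control offset exceeds entry size");

/*
 * Hypothetical helper: physical address of the MSI-X table, given the
 * raw Table Offset/Table BIR register value and the BAR start addresses.
 * Returns 0 if the BIR selects a reserved BAR index (6 or 7).
 */
static uint64_t msix_table_phys(uint32_t table_reg, const uint64_t bar_start[6])
{
    uint32_t bir = table_reg & PCI_MSIX_TABLE_BIR;
    uint32_t offset = table_reg & PCI_MSIX_TABLE_OFFSET;

    if (bir > 5)
        return 0;
    return bar_start[bir] + offset;
}

/*
 * Hypothetical helper: address of the Vector Control dword for one
 * table entry, i.e. the "addr + PCI_MSIX_ENTRY_VECTOR_CTRL" that the
 * readl() in msix_prepare_msi_desc() dereferences. mask_base here is a
 * plain integer standing in for the ioremapped table base.
 */
static uint64_t msix_vector_ctrl_addr(uint64_t mask_base, unsigned int msi_index)
{
    return mask_base + (uint64_t)msi_index * PCI_MSIX_ENTRY_SIZE
           + PCI_MSIX_ENTRY_VECTOR_CTRL;
}
```

If the Table BIR is 0 here, the table sits in the same BAR0 that
ks_pcie_v3_65_add_bus() programs, which is why the actual values matter.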

I wonder if this readl() is being handled as an MMIO access to a
downstream device instead of a Root Port BAR access because it's
inside the Root Port's MMIO window.
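
One way to test that hypothesis is to compare the physical address
against the Root Port's memory window, decoded from the Memory Base and
Memory Limit registers of its type 1 header (config offsets 0x20 and
0x22). The following is a minimal user-space sketch, not kernel code:
struct mmio_window, decode_mem_window() and addr_in_window() are
hypothetical names, and only the 32-bit non-prefetchable window is
modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Register bits 15:4 hold address bits 31:20; the low nibble is not address. */
#define MEM_WINDOW_ADDR_MASK 0xfff0u  /* hypothetical name for this sketch */

struct mmio_window {
    uint64_t base;   /* first address forwarded downstream */
    uint64_t limit;  /* last address forwarded downstream (inclusive) */
    bool enabled;    /* a window with base > limit forwards nothing */
};

/* Decode the 16-bit Memory Base/Limit register values into a byte range. */
static struct mmio_window decode_mem_window(uint16_t mem_base, uint16_t mem_limit)
{
    struct mmio_window w;

    w.base = (uint64_t)(mem_base & MEM_WINDOW_ADDR_MASK) << 16;
    /* The limit is 1 MB granular, so the low 20 bits are all ones. */
    w.limit = ((uint64_t)(mem_limit & MEM_WINDOW_ADDR_MASK) << 16) | 0xfffffu;
    w.enabled = w.base <= w.limit;
    return w;
}

/* True if addr would be claimed by the window and forwarded downstream. */
static bool addr_in_window(const struct mmio_window *w, uint64_t addr)
{
    assert(w != NULL);
    return w->enabled && addr >= w->base && addr <= w->limit;
}
```

If phys_addr from msix_map_region() lands inside this range, the access
would be a candidate for being forwarded downstream, which with the link
down would be consistent with a Completion Timeout.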

Could you dump out these values just before the readl()?

  - phys_addr inside msix_map_region()
  - dev->msix_base
  - desc->pci.mask_base
  - desc->msi_index
  - addr

Could you also call early_dump_pci_device() on the Root Port at that
point?

Bjorn