Hi Bjorn,
On 5/30/2018 10:27 AM, Bjorn Helgaas wrote:
On Thu, May 17, 2018 at 10:21:31AM -0700, Ray Jui wrote:
On certain versions of Broadcom PAXC based root complexes, certain
regions of the configuration space are corrupted. As a result, it
prevents the Linux PCIe stack from traversing the linked list of the
capability registers completely and therefore the root complex is
not advertised as "PCIe capable". This prevents the correct PCIe RID
from being parsed in the kernel PCIe stack. A correct RID is required
for mapping to a stream ID from the SMMU or the device ID from the
GICv3 ITS.
This patch fixes up the issue by manually populating the related
PCIe capabilities based on readings from the PCIe capability structure.
Signed-off-by: Ray Jui <rjui@xxxxxxxxxxxx>
Reviewed-by: Anup Patel <anup.patel@xxxxxxxxxxxx>
Reviewed-by: Scott Branden <scott.branden@xxxxxxxxxxxx>
---
 drivers/pci/quirks.c | 95 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 95 insertions(+)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 47dfea0..0cdbd0a 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -2198,6 +2198,101 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_BROADCOM, 0x16f0, quirk_paxc_bridge);
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_BROADCOM, 0xd750, quirk_paxc_bridge);
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_BROADCOM, 0xd802, quirk_paxc_bridge);
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_BROADCOM, 0xd804, quirk_paxc_bridge);
+
+/*
+ * The PCI capabilities list for certain revisions of Broadcom PAXC root
+ * complexes is incorrectly terminated due to corrupted configuration space
+ * registers in the range of 0x50 - 0x5f.
+ *
+ * As a result, the capability list is broken, which prevents the standard
+ * PCI stack from traversing to the PCIe capability structure.
+ */
+static void quirk_paxc_pcie_capability(struct pci_dev *pdev)
+{
+	int pos, i = 0;
+	u8 next_cap;
+	u16 reg16, *cap;
+	struct pci_cap_saved_state *state;
+
+	/* bail out if PCIe capability can be found */
+	if (pdev->pcie_cap || pci_find_capability(pdev, PCI_CAP_ID_EXP))
+		return;
+
+	/* locate the power management capability */
+	pos = pci_find_capability(pdev, PCI_CAP_ID_PM);
+	if (!pos)
+		return;
+
+	/* bail out if the next capability pointer is not 0x50/0x58 */
+	pci_read_config_byte(pdev, pos + 1, &next_cap);
+	if (next_cap != 0x50 && next_cap != 0x58)
+		return;
+
+	/* bail out if we do not terminate at 0x50/0x58 */
+	pos = next_cap;
+	pci_read_config_byte(pdev, pos + 1, &next_cap);
+	if (next_cap != 0x00)
+		return;
+
+	/*
+	 * On these buggy HW, PCIe capability structure is expected to be at
+	 * 0xac and should terminate the list
+	 *
+	 * Borrow the similar logic from the Intel DH895xCC VFs fixup to save
+	 * the PCIe capability list
+	 */
+	pos = 0xac;
+	pci_read_config_word(pdev, pos, &reg16);
+	if (reg16 == (0x0000 | PCI_CAP_ID_EXP)) {
+		u32 status;
+
+#ifndef PCI_EXP_SAVE_REGS
+#define PCI_EXP_SAVE_REGS	7
+#endif
+		int size = PCI_EXP_SAVE_REGS * sizeof(u16);
+
+		pdev->pcie_cap = pos;
+		pci_read_config_word(pdev, pos + PCI_EXP_FLAGS, &reg16);
+		pdev->pcie_flags_reg = reg16;
+		pci_read_config_word(pdev, pos + PCI_EXP_DEVCAP, &reg16);
+		pdev->pcie_mpss = reg16 & PCI_EXP_DEVCAP_PAYLOAD;
Is there any way you can fix this in iproc_pcie_config_read() instead,
by making it notice when we're reading a corrupted part of config
space, and then returning the correct data instead? Is it just the
next capability pointer that's corrupted?
Let me look into that and I'll get back.
Thanks,
Ray
If you could fix it in the config accessor, lspci would automatically
show all the correct data (I think lspci will still show the wrong
data with this patch).
The quirk seems like a maintenance issue because anything that calls
  pci_find_capability(pdev, PCI_CAP_ID_EXP)
will get the wrong answer.
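
To make the accessor-level idea concrete, here is a rough user-space sketch of what "fix it in the config read path" could look like. It is not the iproc driver code. The config space is a plain byte array, and raw_read8(), fixed_read8(), find_cap() and the PAXC_EXP_CAP_POS name are all hypothetical. It assumes that only the next-capability pointer of the entry at 0x50/0x58 is wrong and that the PCIe capability sits at 0xac, as the quirk above expects. Whether that holds for the real hardware is exactly the open question.

```c
#include <stdint.h>

#define CAP_PTR_OFFSET   0x34 /* standard PCI capabilities pointer */
#define CAP_ID_PM        0x01 /* power management capability ID */
#define CAP_ID_EXP       0x10 /* PCI Express capability ID */
#define PAXC_EXP_CAP_POS 0xac /* hypothetical name; offset taken from the quirk */

typedef uint8_t (*cfg_read8_fn)(const uint8_t *cfg, unsigned int off);

/* Raw accessor: returns whatever the (simulated) hardware holds. */
static uint8_t raw_read8(const uint8_t *cfg, unsigned int off)
{
	return cfg[off & 0xff];
}

/*
 * Fixed-up accessor (assumption: only the next pointer is corrupted).
 * If the capability at 0x50 or 0x58 wrongly terminates the list and a
 * PCIe capability ID is present at 0xac, report 0xac as the next pointer.
 */
static uint8_t fixed_read8(const uint8_t *cfg, unsigned int off)
{
	uint8_t val = raw_read8(cfg, off);

	if ((off == 0x51 || off == 0x59) && val == 0x00 &&
	    raw_read8(cfg, PAXC_EXP_CAP_POS) == CAP_ID_EXP)
		return PAXC_EXP_CAP_POS;

	return val;
}

/* Walk the capability list through the given accessor; 0 means not found. */
static unsigned int find_cap(const uint8_t *cfg, uint8_t id, cfg_read8_fn rd)
{
	unsigned int pos = rd(cfg, CAP_PTR_OFFSET);
	int ttl = 48; /* guard against loops in a broken list */

	while (pos >= 0x40 && ttl--) {
		if (rd(cfg, pos) == id)
			return pos;
		pos = rd(cfg, pos + 1);
	}
	return 0;
}
```

With an accessor like fixed_read8(), every walker of the capability list, including lspci and pci_find_capability(), would see the same repaired list, so no per-device state has to be populated by hand.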
+
+		pdev->cfg_size = PCI_CFG_SPACE_EXP_SIZE;
+		if (pci_read_config_dword(pdev, PCI_CFG_SPACE_SIZE, &status) !=
+		    PCIBIOS_SUCCESSFUL || (status == 0xffffffff))
+			pdev->cfg_size = PCI_CFG_SPACE_SIZE;
+
+		if (pci_find_saved_cap(pdev, PCI_CAP_ID_EXP))
+			return;
+
+		state = kzalloc(sizeof(*state) + size, GFP_KERNEL);
+		if (!state)
+			return;
+
+		state->cap.cap_nr = PCI_CAP_ID_EXP;
+		state->cap.cap_extended = 0;
+		state->cap.size = size;
+		cap = (u16 *)&state->cap.data[0];
+		pcie_capability_read_word(pdev, PCI_EXP_DEVCTL, &cap[i++]);
+		pcie_capability_read_word(pdev, PCI_EXP_LNKCTL, &cap[i++]);
+		pcie_capability_read_word(pdev, PCI_EXP_SLTCTL, &cap[i++]);
+		pcie_capability_read_word(pdev, PCI_EXP_RTCTL, &cap[i++]);
+		pcie_capability_read_word(pdev, PCI_EXP_DEVCTL2, &cap[i++]);
+		pcie_capability_read_word(pdev, PCI_EXP_LNKCTL2, &cap[i++]);
+		pcie_capability_read_word(pdev, PCI_EXP_SLTCTL2, &cap[i++]);
+		hlist_add_head(&state->next, &pdev->saved_cap_space);
+	}
+}
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_NX2_57810,
+			quirk_paxc_pcie_capability);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_BROADCOM, 0x16cd,
+			quirk_paxc_pcie_capability);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_BROADCOM, 0x16f0,
+			quirk_paxc_pcie_capability);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_BROADCOM, 0xd802,
+			quirk_paxc_pcie_capability);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_BROADCOM, 0xd804,
+			quirk_paxc_pcie_capability);
 #endif
 /* Originally in EDAC sources for i82875P:
--
2.1.4