Re: [PATCH v3] PCI: Sanitise firmware BAR assignments behind a PCI-PCI bridge

From: Bjorn Helgaas
Date: Wed Sep 21 2022 - 18:53:51 EST


On Wed, Sep 21, 2022 at 08:49:16PM +0100, Maciej W. Rozycki wrote:
> Fix an issue with the Tyan Tomcat IV S1564D system, the BIOS of which
> does not assign PCI buses beyond #2, where our resource reallocation
> code preserves the reset default of an I/O BAR assignment outside its
> upstream PCI-to-PCI bridge's I/O forwarding range:
>
> pci 0000:06:08.0: BAR 4: no space for [io size 0x0020]
> pci 0000:06:08.0: BAR 4: trying firmware assignment [io 0xfce0-0xfcff]
> pci 0000:06:08.0: BAR 4: assigned [io 0xfce0-0xfcff]
> [...]
> pci_bus 0000:06: resource 0 [io 0x2000-0x2fff]
>
> Consequently when the device driver tries to access 06:08.0 according to
> its designated address range it pokes at an unassigned I/O location,
> likely subtractively decoded by the southbridge and forwarded to ISA,
> causing the driver to become confused and bail out:
>
> uhci_hcd 0000:06:08.0: host system error, PCI problems?
> uhci_hcd 0000:06:08.0: host controller process error, something bad happened!
> uhci_hcd 0000:06:08.0: host controller halted, very bad!
> uhci_hcd 0000:06:08.0: HCRESET not completed yet!
> uhci_hcd 0000:06:08.0: HC died; cleaning up
>
> if we are lucky or, if we are not, to produce an infinite flood of messages:
>
> uhci_hcd 0000:06:08.0: host system error, PCI problems?
> uhci_hcd 0000:06:08.0: host controller process error, something bad happened!
>
> making the system virtually unusable.
>
> This is because we try to retain any BAR assignment the firmware may
> have made here, which may be necessary for devices on the root bus with
> some systems, but cannot work for devices that are behind a PCI-to-PCI
> bridge where the BAR assignment is outside the upstream bridge's
> forwarding range.
>
> For a device behind a PCI-to-PCI bridge, make sure that any firmware
> assignment is within the bridge's relevant forwarding window, and
> otherwise do not restore the assignment, fixing the system concerned as
> follows:
>
> pci 0000:06:08.0: BAR 4: no space for [io size 0x0020]
> pci 0000:06:08.0: BAR 4: failed to assign [io 0xfce0-0xfcff]
> [...]
> pci 0000:06:08.0: BAR 4: assigned [io 0x2000-0x201f]
>
> and making device 06:08.0 work correctly.
>
> Cf. <https://bugzilla.kernel.org/show_bug.cgi?id=16263>
>
> Signed-off-by: Maciej W. Rozycki <macro@xxxxxxxxxxx>
> Link: https://lore.kernel.org/r/alpine.DEB.2.21.2203012338460.46819@xxxxxxxxxxxxxxxxx
> Fixes: 58c84eda0756 ("PCI: fall back to original BIOS BAR addresses")
> Cc: stable@xxxxxxxxxxxxxxx # v2.6.35+
> ---
> Hi Bjorn,
>
> I have trimmed the change description down as you requested and left the
> change proper unmodified, as discussed in my earlier response.

I think this is great. It shouldn't have taken me this long, so
thanks for persevering.

I think we can use pci_upstream_bridge() as below. Let me know if
not.

Here it is as I applied it to pci/resource for v6.1:

commit 0e3281839742 ("PCI: Sanitise firmware BAR assignments behind a PCI-PCI bridge")
Author: Maciej W. Rozycki <macro@xxxxxxxxxxx>
Date: Wed Sep 21 20:49:16 2022 +0100

PCI: Sanitise firmware BAR assignments behind a PCI-PCI bridge

When pci_assign_resource() is unable to assign resources to a BAR, it uses
pci_revert_fw_address() to fall back to a firmware assignment (if any).
Previously pci_revert_fw_address() assumed all addresses could reach the
device, but this is not true if the device is below a bridge that only
forwards addresses within its windows.
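
As a rough illustration of what "within its windows" means, here is a
standalone sketch (not kernel code). The struct and both helper names are
hypothetical, and ranges are inclusive, matching the way the kernel prints
them, e.g. [io 0x2000-0x2fff]:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an inclusive address range. */
struct range {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

/* True if 'bar' lies entirely inside the bridge window 'win'. */
static bool window_contains(const struct range *win, const struct range *bar)
{
	return win->start <= bar->start && bar->end <= win->end;
}

/*
 * A BAR is reachable only if every bridge window on the path from the
 * root bus down to the device forwards the whole BAR range.
 */
static bool bar_reachable(const struct range *windows, int nr_windows,
			  const struct range *bar)
{
	for (int i = 0; i < nr_windows; i++)
		if (!window_contains(&windows[i], bar))
			return false;
	return true;
}
```

With the windows and BAR values reported in the logs in this message, a BAR
at [io 0xfce0-0xfcff] fails this check against a window of
[io 0x2000-0x2fff], while [io 0x2000-0x201f] passes.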

This problem was observed on a Tyan Tomcat IV S1564D system where the BIOS
did not assign valid addresses to several bridges and USB devices:

pci 0000:00:11.0: PCI-to-PCIe bridge to [bus 01-ff]
pci 0000:00:11.0: bridge window [io 0xe000-0xefff]
pci 0000:01:00.0: PCIe Upstream Port to [bus 02-ff]
pci 0000:01:00.0: bridge window [io 0x0000-0x0fff] # unreachable
pci 0000:02:02.0: PCIe Downstream Port to [bus 05-ff]
pci 0000:02:02.0: bridge window [io 0x0000-0x0fff] # unreachable
pci 0000:05:00.0: PCIe-to-PCI bridge to [bus 06-ff]
pci 0000:05:00.0: bridge window [io 0x0000-0x0fff] # unreachable
pci 0000:06:08.0: USB UHCI 1.1
pci 0000:06:08.0: BAR 4: [io 0xfce0-0xfcff] # unreachable
pci 0000:06:08.1: USB UHCI 1.1
pci 0000:06:08.1: BAR 4: [io 0xfce0-0xfcff] # unreachable
pci 0000:06:08.0: can't claim BAR 4 [io 0xfce0-0xfcff]: no compatible bridge window
pci 0000:06:08.1: can't claim BAR 4 [io 0xfce0-0xfcff]: no compatible bridge window

During the first pass of assigning unassigned resources, there was not
enough I/O space available, so we couldn't assign the 06:08.0 BAR and
reverted to the firmware assignment (still unreachable). Reverting the
06:08.1 assignment failed because it conflicted with 06:08.0:

pci 0000:00:11.0: bridge window [io 0xe000-0xefff]
pci 0000:01:00.0: no space for bridge window [io size 0x2000]
pci 0000:02:02.0: no space for bridge window [io size 0x1000]
pci 0000:05:00.0: no space for bridge window [io size 0x1000]
pci 0000:06:08.0: BAR 4: no space for [io size 0x0020]
pci 0000:06:08.0: BAR 4: trying firmware assignment [io 0xfce0-0xfcff]
pci 0000:06:08.1: BAR 4: no space for [io size 0x0020]
pci 0000:06:08.1: BAR 4: trying firmware assignment [io 0xfce0-0xfcff]
pci 0000:06:08.1: BAR 4: [io 0xfce0-0xfcff] conflicts with 0000:06:08.0 [io 0xfce0-0xfcff]

A subsequent pass assigned valid bridge windows and a valid 06:08.1 BAR,
but left the 06:08.0 BAR alone, so the UHCI device was still unusable:

pci 0000:00:11.0: bridge window [io 0xe000-0xefff] released
pci 0000:00:11.0: bridge window [io 0x1000-0x2fff] # reassigned
pci 0000:01:00.0: bridge window [io 0x1000-0x2fff] # reassigned
pci 0000:02:02.0: bridge window [io 0x2000-0x2fff] # reassigned
pci 0000:05:00.0: bridge window [io 0x2000-0x2fff] # reassigned
pci 0000:06:08.0: BAR 4: assigned [io 0xfce0-0xfcff] # left alone
pci 0000:06:08.1: BAR 4: assigned [io 0x2000-0x201f]
...
uhci_hcd 0000:06:08.0: host system error, PCI problems?
uhci_hcd 0000:06:08.0: host controller process error, something bad happened!
uhci_hcd 0000:06:08.0: host controller halted, very bad!
uhci_hcd 0000:06:08.0: HCRESET not completed yet!
uhci_hcd 0000:06:08.0: HC died; cleaning up

If the address assigned by firmware is not reachable because it's not
within upstream bridge windows, fail instead of assigning the unusable
address from firmware.
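
The resulting decision can be modelled roughly as follows. This is a toy
sketch, not the kernel function: the enum, the function name and the two
boolean flags are hypothetical. 'in_parent_window' stands for a successful
pci_find_parent_resource() lookup and 'behind_bridge' for a non-NULL
pci_upstream_bridge():

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical labels for where a firmware address would be claimed. */
enum fw_root {
	FW_ROOT_NONE,		/* firmware address rejected */
	FW_ROOT_WINDOW,		/* inside an upstream bridge window */
	FW_ROOT_WHOLE_SPACE,	/* root bus: whole I/O or memory space */
};

/* Returns 0 and sets *root on success, or -ENXIO if unreachable. */
static int pick_fw_root(bool in_parent_window, bool behind_bridge,
			enum fw_root *root)
{
	if (in_parent_window) {
		*root = FW_ROOT_WINDOW;
		return 0;
	}

	/* Behind a bridge, an address outside its windows never arrives. */
	if (behind_bridge) {
		*root = FW_ROOT_NONE;
		return -ENXIO;
	}

	/* On the root bus, assume the host bridge forwards everything. */
	*root = FW_ROOT_WHOLE_SPACE;
	return 0;
}
```

Only the middle case is new behaviour; the other two match what the code
did before this change.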

[bhelgaas: commit log, use pci_upstream_bridge()]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=16263
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2203012338460.46819@xxxxxxxxxxxxxxxxx
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2209211921250.29493@xxxxxxxxxxxxxxxxx
Fixes: 58c84eda0756 ("PCI: fall back to original BIOS BAR addresses")
Signed-off-by: Maciej W. Rozycki <macro@xxxxxxxxxxx>
Signed-off-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx # v2.6.35+

diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c
index 439ac5f5907a..b492e67c3d87 100644
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -214,6 +214,17 @@ static int pci_revert_fw_address(struct resource *res, struct pci_dev *dev,

 	root = pci_find_parent_resource(dev, res);
 	if (!root) {
+		/*
+		 * If dev is behind a bridge, accesses will only reach it
+		 * if res is inside the relevant bridge window.
+		 */
+		if (pci_upstream_bridge(dev))
+			return -ENXIO;
+
+		/*
+		 * On the root bus, assume the host bridge will forward
+		 * everything.
+		 */
 		if (res->flags & IORESOURCE_IO)
 			root = &ioport_resource;
 		else