Re: [BUG] Bisected Problem with LSI PCI FC Adapter

From: Bjorn Helgaas
Date: Sat Sep 13 2014 - 00:07:29 EST


I want to fix this regression before v3.17. Dirk, can you test the
following patch on top of v3.17-rc2? I'm hoping you can try this on your
test machine in conjunction with your acpi_pci_root_add() and
pci_scan_device() patches. If I understand correctly, you were able to
reproduce the FC adapter not showing up, and if you can verify that it does
show with those patches + this revert, I think that's good enough for now.

I'm not committed to applying this yet, but I'd like to have a working fix
in my back pocket in case we don't come up with a better solution soon.

Bjorn


commit 5945a8d28c416fc390a94c8e7fb8fd0a76f5d710
Author: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
Date: Fri Sep 12 21:58:19 2014 -0600

Revert "PCI: Make sure bus number resources stay within their parents bounds"

This reverts commit 1820ffdccb9b ("PCI: Make sure bus number resources stay
within their parents bounds") because it breaks some systems with LSI Logic
FC949ES Fibre Channel Adapters, apparently by exposing a defect in those
adapters.

Dirk tested a Tyan VX50 (B4985) with this device that worked like this
prior to 1820ffdccb9b:

bus: [bus 00-7f] on node 0 link 1
ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-07])
pci 0000:00:0e.0: PCI bridge to [bus 0a]
pci_bus 0000:0a: busn_res: can not insert [bus 0a] under [bus 00-07] (conflicts with (null) [bus 00-07])
pci 0000:0a:00.0: [1000:0646] type 00 class 0x0c0400 (FC adapter)

Note that the root bridge [bus 00-07] aperture is wrong; this is a BIOS
defect in the PCI0 _CRS method. But prior to 1820ffdccb9b, we didn't
enforce that aperture, and the FC adapter worked fine at 0a:00.0.

After 1820ffdccb9b, we notice that 00:0e.0's aperture is not contained in
the root bridge's aperture, so we reconfigure it so it *is* contained:

pci 0000:00:0e.0: bridge configuration invalid ([bus 0a-0a]), reconfiguring
pci 0000:00:0e.0: PCI bridge to [bus 06-07]

This effectively moves the FC device from 0a:00.0 to 07:00.0, which should
be legal. But when we enumerate bus 06, the FC device doesn't respond, so
we don't find anything. This is probably a defect in the FC device.

Possible fixes (due to Yinghai):

1) Add a quirk to fix the _CRS information based on what amd_bus.c read
from the hardware

2) Reset the FC device after we change its bus number

3) Revert 1820ffdccb9b

Fix 1 would be relatively easy, but it does sweep the LSI FC issue under
the rug. We might want to reconfigure bus numbers in the future for some
other reason, e.g., hotplug, and then we could trip over this again.

For that reason, I like fix 2, but we don't know whether it actually works,
and we don't have a patch for it yet.

This revert is fix 3, which also sweeps the LSI FC issue under the rug.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=84281
Reported-by: Dirk Gouders <dirk@xxxxxxxxxxx>
Signed-off-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
CC: stable@xxxxxxxxxxxxxxx # v3.15+
CC: Yinghai Lu <yinghai@xxxxxxxxxx>

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index e3cf8a2e6292..f0badff77cff 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -775,7 +775,7 @@ int pci_scan_bridge(struct pci_bus *bus, struct pci_dev *dev, int max, int pass)
/* Check if setup is sensible at all */
if (!pass &&
(primary != bus->number || secondary <= bus->number ||
- secondary > subordinate || subordinate > bus->busn_res.end)) {
+ secondary > subordinate)) {
dev_info(&dev->dev, "bridge configuration invalid ([bus %02x-%02x]), reconfiguring\n",
secondary, subordinate);
broken = 1;
@@ -853,8 +853,7 @@ int pci_scan_bridge(struct pci_bus *bus, struct pci_dev *dev, int max, int pass)
child = pci_add_new_bus(bus, dev, max+1);
if (!child)
goto out;
- pci_bus_insert_busn_res(child, max+1,
- bus->busn_res.end);
+ pci_bus_insert_busn_res(child, max+1, 0xff);
}
max++;
buses = (buses & 0xff000000)
@@ -913,11 +912,6 @@ int pci_scan_bridge(struct pci_bus *bus, struct pci_dev *dev, int max, int pass)
/*
* Set the subordinate bus number to its real value.
*/
- if (max > bus->busn_res.end) {
- dev_warn(&dev->dev, "max busn %02x is outside %pR\n",
- max, &bus->busn_res);
- max = bus->busn_res.end;
- }
pci_bus_update_busn_res_end(child, max);
pci_write_config_byte(dev, PCI_SUBORDINATE_BUS, max);
}
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/