[REGRESSION, bisect] pci: cxgb4 probe fails after commit 104daa71b3961434 ("PCI: Determine actual VPD size on first access")

From: Hariprasad Shenai
Date: Tue Apr 12 2016 - 01:29:56 EST


Hi All,


The following patch introduced a regression, causing cxgb4 driver to fail in
PCIe probe.

commit 104daa71b39614343929e1982170d5fcb0569bb5
Author: Hannes Reinecke <hare@xxxxxxx>
Author: Hannes Reinecke <hare@xxxxxxx>
Date: Mon Feb 15 09:42:01 2016 +0100

PCI: Determine actual VPD size on first access

PCI-2.2 VPD entries have a maximum size of 32k, but might actually be
smaller than that. To figure out the actual size one has to read the VPD
area until the 'end marker' is reached.

Per spec, reading outside of the VPD space is "not allowed." In practice,
it may cause simple read errors or even crash the card. To make matters
worse not every PCI card implements this properly, leaving us with no 'end'
marker or even completely invalid data.

Try to determine the size of the VPD data when it's first accessed. If no
valid data can be read an I/O error will be returned when reading or
writing the sysfs attribute.

As the amount of VPD data is unknown initially the size of the sysfs
attribute will always be set to '0'.

[bhelgaas: changelog, use 0/1 (not false/true) for bitfield, tweak
pci_vpd_pci22_read() error checking]
Tested-by: Shane Seymour <shane.seymour@xxxxxxx>
Tested-by: Babu Moger <babu.moger@xxxxxxxxxx>
Signed-off-by: Hannes Reinecke <hare@xxxxxxx>
Signed-off-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
Cc: Alexander Duyck <alexander.duyck@xxxxxxxxx>

The problem is stemming from the fact that the Chelsio adapters actually have
two VPD structures stored in the VPD. An abbreviated on at Offset 0x0 and the
complete VPD at Offset 0x400. The abbreviated one only contains the PN, SN and
EC Keywords, while the complete VPD contains those plus various adapter
constants contained in V0, V1, etc. And it also contains the Base Ethernet MAC
Address in the "NA" Keyword which the cxgb4 driver needs when it can't contact
the adapter firmware. (We don't have the "NA" Keywork in the VPD Structure at
Offset 0x0 because that's not an allowed VPD Keyword in the PCI-E 3.0
specification.)

With the new code, the computed size of the VPD is 0x200 and so our efforts
to read the VPD at Offset 0x400 silently fails. We check the result of the
read looking for a signature 0x82 byte but we're checking against random stack
garbage.

The end result is that the cxgb4 driver now fails the PCI-E Probe.

Thanks,
Hari