Re: [EXT] VPD access Blocked by commit 0d5370d1d85251e5893ab7c90a429464de2e140b

From: Himanshu Madhani
Date: Mon Jun 03 2019 - 18:36:37 EST


Hi Bjorn,

> On May 30, 2019, at 1:58 PM, Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
>
> On Thu, May 30, 2019 at 07:33:01PM +0000, Himanshu Madhani wrote:
>
>> We are able to successfully read VPD config data using lspci and cat
>> command
>
> Yes, you mentioned that in the very first email. I was hoping you
> would include the actual data, e.g., "cat vpd | xxd". That would help
> us figure out why you don't see the panic any more. I suspect either:
>

I missed the request for xxd output. I got access to the system back today
and captured it for you:

# cat /sys/class/pci_bus/0000\:13/device/0000\:13\:00.0/vpd | xxd
00000000: 822d 0051 4c6f 6769 6320 3332 4762 2032  .-.QLogic 32Gb 2
00000010: 2d70 6f72 7420 4643 2074 6f20 5043 4965  -port FC to PCIe
00000020: 2047 656e 3320 7838 2041 6461 7074 6572   Gen3 x8 Adapter
00000030: 9039 0050 4e07 514c 4532 3734 3253 4e0d  .9.PN.QLE2742SN.
00000040: 4146 4431 3533 3359 3032 3939 3945 430f  AFD1533Y02999EC.
00000050: 424b 3332 3130 3430 372d 3035 2030 3356  BK3210407-05 03V
00000060: 3906 3031 3031 3839 5256 01a0 78         9.010189RV..x
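
For anyone checking the structure of this dump, here is a minimal sketch that walks the resource tags the way a VPD reader would. It assumes the standard PCI VPD layout: large resource tags 0x82 (identifier string), 0x90 (VPD-R) and 0x91 (VPD-W), each followed by a 16-bit little-endian length, and the small end tag 0x78. The names parse_vpd and VPD_HEX are made up for illustration and are not a kernel or pciutils interface.

```python
# Sketch only: parse_vpd() and VPD_HEX are hypothetical names, not an existing API.
# Assumes the standard PCI VPD layout: large tags 0x82 (identifier string),
# 0x90 (VPD-R), 0x91 (VPD-W), and the small end tag 0x78.

# Hex columns copied from the xxd output above.
VPD_HEX = """
822d 0051 4c6f 6769 6320 3332 4762 2032
2d70 6f72 7420 4643 2074 6f20 5043 4965
2047 656e 3320 7838 2041 6461 7074 6572
9039 0050 4e07 514c 4532 3734 3253 4e0d
4146 4431 3533 3359 3032 3939 3945 430f
424b 3332 3130 3430 372d 3035 2030 3356
3906 3031 3031 3839 5256 01a0 78
"""


def parse_vpd(data):
    """Walk the resource tags; return (identifier, keyword fields, total size)."""
    ident, fields, off = None, {}, 0
    while off < len(data):
        tag = data[off]
        if tag == 0x78:  # end tag: the VPD data stops here
            return ident, fields, off + 1
        if tag not in (0x82, 0x90, 0x91):
            raise ValueError("unexpected tag 0x%02x at offset 0x%x" % (tag, off))
        if off + 3 > len(data):
            raise ValueError("truncated resource header at offset 0x%x" % off)
        length = data[off + 1] | (data[off + 2] << 8)  # 16-bit little-endian
        body = data[off + 3:off + 3 + length]
        if len(body) != length:
            raise ValueError("resource at 0x%x runs past the end of the data" % off)
        if tag == 0x82:
            ident = body.decode("ascii")
        else:
            # VPD-R / VPD-W: 2-byte keyword, 1-byte length, then the value
            pos = 0
            while pos + 3 <= len(body):
                key = body[pos:pos + 2].decode("ascii")
                klen = body[pos + 2]
                fields[key] = body[pos + 3:pos + 3 + klen]
                pos += 3 + klen
        off += 3 + length
    raise ValueError("no end tag found")


ident, fields, size = parse_vpd(bytes.fromhex("".join(VPD_HEX.split())))
print(ident)
for key, value in fields.items():
    print(key, value)
print("VPD size: 0x%x bytes" % size)
```

On this dump the walk reaches the end tag at offset 0x6c, so the structure is 0x6d bytes long and a reader that follows the tags has no reason to go past it.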


The PCIe trace also confirmed there are no READ errors.
(If you need it, I can attach the .pex file for review.)


> - new QLogic firmware fixed the structure of the VPD data so Linux
> no longer attempts to read past the end of the implemented region,
> or,
>
> - we still read past the end of the implemented VPD region, but the
> device doesn't report an error or the platform deals with the
> error without causing a panic.
>
>> We also verified this same configuration on a SuperMicro X10SRA-F
>> server (which i had sent in earlier email) and were able to verify
>> that the VPD read was good and there were no errors on PCIe trace.
>
> Since you saw no PCIe errors here, this suggests that new firmware has
> changed the format of the VPD data.
>
>> Given this information, Please consider reverting the patch until we
>> further debug the issue and resolve as it is affecting general
>> availability of our adapter.
>
> 1) The way Linux works is that you would post a patch that does the
> revert you'd like to see done.
>

Correct. I was trying to get your buy-in before I send out a patch.

> 2) It's unlikely that a simple revert of 0d5370d1d852 ("PCI: Prevent
> VPD access for QLogic ISP2722") is the right answer because that would
> make Ethan's machine panic again. It's possible that a QLogic
> firmware update would avoid the panic, but we can't simply revert the
> patch and force users to do that update.
>

I did reach out to Oracle to help locate the original card where Ethan had
the issue, and I learned that he is no longer with Oracle.

> If a QLogic firmware update indeed fixed the VPD format, I suggest
> that you ask the folks responsible for the firmware to identify the
> specific version where that was fixed and how the OS can figure that
> out.
>

Still waiting on this data.

> Then you could make a new quirk specific to this device that allows
> VPD reads if the adapter has new enough firmware. If it finds older
> firmware, it could even print a message suggesting that users could
> update the firmware if they need to read VPD data.
>
> Bjorn

Since major OEMs are having issues using the adapter to extract VPD data, we
would like to get them relief first and then approach this issue with a more
detailed fix if needed.

Thanks,
Himanshu