RE: [PATCH] PCI: dwc: Added link up check in map_bus of dw_child_pcie_ops

From: Z.q. Hou
Date: Mon Sep 21 2020 - 12:31:34 EST


Hi Bjorn,

Thanks a lot for your comments!

> -----Original Message-----
> From: Bjorn Helgaas <helgaas@xxxxxxxxxx>
> Sent: 2020年9月18日 20:47
> To: Z.q. Hou <zhiqiang.hou@xxxxxxx>
> Cc: Rob Herring <robh@xxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx; PCI
> <linux-pci@xxxxxxxxxxxxxxx>; Lorenzo Pieralisi <lorenzo.pieralisi@xxxxxxx>;
> Bjorn Helgaas <bhelgaas@xxxxxxxxxx>; Gustavo Pimentel
> <gustavo.pimentel@xxxxxxxxxxxx>; Michael Walle <michael@xxxxxxxx>;
> Ard Biesheuvel <ardb@xxxxxxxxxx>
> Subject: Re: [PATCH] PCI: dwc: Added link up check in map_bus of
> dw_child_pcie_ops
>
> On Fri, Sep 18, 2020 at 11:02:07AM +0000, Z.q. Hou wrote:
>
> > But now the SError is exactly caused by the first access of the
> > non-existent function, I dug into the kernel enumeration code and
> > found it will fire a 4Byte CFG read transaction to read the Vendor ID
> > and Device ID together, so I suspect the root cause is access the
> > Device ID of a non-existent function triggers SError.
> >
> > So the alternative solution seems to correct the PCIe enumeration, I
> > will submit a patch to let the first access only read the Vendor ID.
>
> If it is incorrect for the first access to be a 32-bit read of both the Vendor
> and the Device ID, please cite the relevant section of the spec in your patch.
>
> I don't like to make changes to generic code to accommodate specific pieces
> of hardware because then we restrict future changes based on some device
> that will soon be obsolete and forgotten.
>
> I'm pretty sure the spec language about CRS handling is careful to talk about
> "reads that *include* Vendor ID", not just "reads of Vendor ID", so the
> implication is that it covers 32-bit reads as well as 16-bit reads.
>

Yes, I agree with you that we must be carful with the generic code.
NXP Layerscape SError aside, it turns out to be more complex, limiting the first CFG access to 16-bit Vendor ID also causes SError, the hardware behavior seems not the same as the described of the register PCIE_ABSERR.

For the PCIe enumeration, I found the descriptions of Vendor ID and Device ID in the PCI Express Base Specification, Rev. 4.0 Version 1.0 (pasted below), it recommends to read Vendor ID to determine the presentence of a function and use the Device ID (with Vendor ID and Revision ID) to determine the driver needed. But in section 2.3.2 Completion Handling Rules, it seems, as you said, not limit to 16-bit Vendor ID, so I want to hear your and Rob's suggestion on this change.

7.5.1.1.1 Vendor ID Register (Offset 00h)
The Vendor ID register is HwInit and the value in this register identifies the manufacturer of the
Function. In keeping with PCI-SIG procedures, valid vendor identifiers must be allocated by the
PCI-SIG to ensure uniqueness. Each vendor must have at least one Vendor ID. It is recommended
that software read the Vendor ID register to determine if a Function is present, where a value of
FFFFh indicates that no Function is present.
7.5.1.1.2 Device ID Register (Offset 02h)
The Device ID register is HwInit and the value in this register identifies the particular Function.
The Device ID must be allocated by the vendor. The Device ID, in conjunction with the Vendor ID
and Revision ID, are used as one mechanism for software to determine which driver should be
loaded. The vendor must ensure that the chosen values do not result in the use of an incompatible
device driver.

Regards,
Zhiqiang

> Bjorn