Re: [PATCH v2] pci/probe: Enable CRS for root port if it is supported

From: Rajat Jain
Date: Tue Sep 09 2014 - 14:35:15 EST


On Mon, Sep 8, 2014 at 10:38 PM, Bjorn Helgaas <bhelgaas@xxxxxxxxxx> wrote:
> On Tue, Sep 02, 2014 at 04:26:00PM -0700, Rajat Jain wrote:
>>
>> As per the PCIe spec, an endpoint may return the configuration cycles
>> with CRS if it is not yet fully ready to be accessed. If the CRS visibility
>> is not enabled at the root port, the spec leaves the retry behaviour open
>> to implementation in such a case. The Intel root ports have chosen to retry
>> endlessly in this situation. Thus, the root controller may "hang" (repeatedly
>> retrying the configuration requests until it gets a status other than CRS) if
>> a device returns CRS for a long time. This can cause a broken endpoint to bring
>> down the whole PCI hierarchy.
>>
>> This was recently known to cause problems on Intel systems and
>> was discussed here:
>> http://marc.info/?t=140926298500002&r=1&w=2
>>
>> Ref1:
>> https://www.pcisig.com/specifications/pciexpress/ECN_CRS_Software_Visibility_No27.pdf
>>
>> Ref2:
>> PCIe spec V3.0, pg119, pg127 for "Configuration Request Retry Status"
>>
>> Thus enable the CRS visibility for the root ports that support it. This
>> patch reverts the following commit, but enables CRS visibility only
>> when the root port supports it:
>>
>> ad7edfe04908 ("[PCI] Do not enable CRS Software Visibility by default")
>>
>> (Linus' response: http://marc.info/?l=linux-pci&m=140968622520192&w=2)
>>
>> Signed-off-by: Rajat Jain <rajatxjain@xxxxxxxxx>
>> Signed-off-by: Rajat Jain <rajatjain@xxxxxxxxxxx>
>> Signed-off-by: Guenter Roeck <groeck@xxxxxxxxxxx>
>
> I put this and the "only look at Vendor ID" patch on a pci/enumeration
> branch [1]. I rewrote the changelogs to reflect my understanding of what's
> going on, so probably the real truth is somewhere between your original and
> mine. Let me know what should be fixed.
>
> We should figure out an easy way for Josh to test these. Ideally, he could
> test the second patch by itself first, then both together.

I just provided the 2 images to Josh:

1) An image with both the patches.
2) An image with just the patch 2 i.e. "Enable CRS visibility" patch.

Lets hope that results tally with our expectations.

Thanks,

Rajat


>
> [1] https://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/log/?h=pci/enumeration
>
>> ---
>> v2: Remove the white list, that was enabling the CRS for certain known Intel systems.
>> Rather now enable it for all systems that support this capability.
>> v1: Enable CRS for only some given Intel systems (maintain a whitelist)
>>
>> drivers/pci/probe.c | 13 +++++++++++++
>> include/uapi/linux/pci_regs.h | 1 +
>> 2 files changed, 14 insertions(+)
>>
>> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
>> index e3cf8a2..3c4c35c 100644
>> --- a/drivers/pci/probe.c
>> +++ b/drivers/pci/probe.c
>> @@ -740,6 +740,17 @@ struct pci_bus *pci_add_new_bus(struct pci_bus *parent, struct pci_dev *dev,
>> }
>> EXPORT_SYMBOL(pci_add_new_bus);
>>
>> +static void pci_enable_crs(struct pci_dev *pdev)
>> +{
>> + u16 root_cap = 0;
>> +
>> + /* Enable CRS Software visibility if supported */
>> + pcie_capability_read_word(pdev, PCI_EXP_RTCAP, &root_cap);
>> + if (root_cap & PCI_EXP_RTCAP_CRSVIS)
>> + pcie_capability_set_word(pdev, PCI_EXP_RTCTL,
>> + PCI_EXP_RTCTL_CRSSVE);
>> +}
>> +
>> /*
>> * If it's a bridge, configure it and scan the bus behind it.
>> * For CardBus bridges, we don't scan behind as the devices will
>> @@ -787,6 +798,8 @@ int pci_scan_bridge(struct pci_bus *bus, struct pci_dev *dev, int max, int pass)
>> pci_write_config_word(dev, PCI_BRIDGE_CONTROL,
>> bctl & ~PCI_BRIDGE_CTL_MASTER_ABORT);
>>
>> + pci_enable_crs(dev);
>> +
>> if ((secondary || subordinate) && !pcibios_assign_all_busses() &&
>> !is_cardbus && !broken) {
>> unsigned int cmax;
>> diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
>> index 30db069..a75106d 100644
>> --- a/include/uapi/linux/pci_regs.h
>> +++ b/include/uapi/linux/pci_regs.h
>> @@ -552,6 +552,7 @@
>> #define PCI_EXP_RTCTL_PMEIE 0x0008 /* PME Interrupt Enable */
>> #define PCI_EXP_RTCTL_CRSSVE 0x0010 /* CRS Software Visibility Enable */
>> #define PCI_EXP_RTCAP 30 /* Root Capabilities */
>> +#define PCI_EXP_RTCAP_CRSVIS 0x0001 /* CRS software visibility capability */
>> #define PCI_EXP_RTSTA 32 /* Root Status */
>> #define PCI_EXP_RTSTA_PME 0x00010000 /* PME status */
>> #define PCI_EXP_RTSTA_PENDING 0x00020000 /* PME pending */
>> --
>> 1.7.9.5
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/