Re: [PATCH] PCI: Disallow ASPM on ASMedia ASM1083/1085 PCIe-PCI bridge

From: Bjorn Helgaas
Date: Wed Jul 22 2020 - 13:40:14 EST


[+cc Puranjay]

On Tue, Jul 21, 2020 at 08:18:03PM -0600, Robert Hancock wrote:
> Recently ASPM handling was changed to no longer disable ASPM on all
> PCIe to PCI bridges. Unfortunately these ASMedia PCIe to PCI bridge
> devices don't seem to function properly with ASPM enabled, as they
> cause the parent PCIe root port to report repeated AER timeout errors.
> In addition to flooding the kernel log, this also causes the machine
> to wake up immediately after suspend is initiated.

Hi Robert, thanks a lot for the report of this problem
(https://lore.kernel.org/r/CADLC3L1R2hssRjxHJv9yhdN_7-hGw58rXSfNp-FraZh0Tw+gRw@xxxxxxxxxxxxxx
and https://bugzilla.redhat.com/show_bug.cgi?id=1853960).

I'm pretty sure Linux ASPM support is missing some things. This
might be a hardware problem where a quirk is the right solution, but
it could also be the result of a Linux defect that we should fix.

Could you collect the dmesg log and "sudo lspci -vvxxxx" output and
attach them somewhere (maybe a bugzilla.kernel.org issue)? I want to
figure out whether L1 PM Substates are enabled on this link, and
whether they're configured correctly.
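
For cross-checking the raw hex dump against what lspci decodes, here is
a minimal sketch of the two fields I'd look at. It assumes the standard
register layout: the ASPM Control field is bits 1:0 of Link Control
(offset 0x10 in the PCIe capability), and the four enable bits are the
low nibble of L1 PM Substates Control 1 (offset 0x08 in the L1 PM
Substates extended capability). The macro and function names are made up
for this illustration; they are not kernel identifiers.

```c
#include <stdint.h>

/* ASPM Control field, bits 1:0 of the PCIe Link Control register.
 * Names are hypothetical, local to this sketch. */
#define LNKCTL_ASPM_L0S		0x0001
#define LNKCTL_ASPM_L1		0x0002

/* Enable bits in L1 PM Substates Control 1 (hypothetical names). */
#define L1SS_CTL1_PCIPM_L1_2	0x00000001
#define L1SS_CTL1_PCIPM_L1_1	0x00000002
#define L1SS_CTL1_ASPM_L1_2	0x00000004
#define L1SS_CTL1_ASPM_L1_1	0x00000008
#define L1SS_CTL1_ANY		0x0000000f

/* Describe which ASPM states the Link Control value enables. */
static const char *aspm_ctl_str(uint16_t lnkctl)
{
	switch (lnkctl & (LNKCTL_ASPM_L0S | LNKCTL_ASPM_L1)) {
	case 0:
		return "ASPM Disabled";
	case LNKCTL_ASPM_L0S:
		return "L0s";
	case LNKCTL_ASPM_L1:
		return "L1";
	default:
		return "L0s L1";
	}
}

/* Return 1 if any L1 substate (PCI-PM or ASPM, L1.1 or L1.2) is enabled. */
static int l1ss_enabled(uint32_t l1ss_ctl1)
{
	return (l1ss_ctl1 & L1SS_CTL1_ANY) != 0;
}
```

For example, a Link Control value of 0x0042 decodes as "L1" only, and an
L1 PM Substates Control 1 value with a zero low nibble means no
substates are enabled. Both ends of the link need checking, since the
root port and the bridge each have their own copies of these registers.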

> Fixes: 66ff14e59e8a ("PCI/ASPM: Allow ASPM on links to PCIe-to-PCI/PCI-X Bridges")
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Robert Hancock <hancockrwd@xxxxxxxxx>
> ---
> drivers/pci/quirks.c | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 812bfc32ecb8..e5713114f2ab 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -2330,6 +2330,19 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10f1, quirk_disable_aspm_l0s);
> DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10f4, quirk_disable_aspm_l0s);
> DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1508, quirk_disable_aspm_l0s);
>
> +static void quirk_disable_aspm_l0s_l1(struct pci_dev *dev)
> +{
> + pci_info(dev, "Disabling ASPM L0s/L1\n");
> + pci_disable_link_state(dev, PCIE_LINK_STATE_L0S | PCIE_LINK_STATE_L1);
> +}
> +
> +/*
> + * ASM1083/1085 PCIe-PCI bridge devices cause AER timeout errors on the
> + * upstream PCIe root port when ASPM is enabled. At least L0s mode is affected,
> + * disable both L0s and L1 for now to be safe.
> + */
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ASMEDIA, 0x1080, quirk_disable_aspm_l0s_l1);
> +
> /*
> * Some Pericom PCIe-to-PCI bridges in reverse mode need the PCIe Retrain
> * Link bit cleared after starting the link retrain process to allow this
> --
> 2.26.2
>