Re: [PATCH] PCI: Add PCI quirk to disable L0s ASPM state for RTL8125 2.5GbE Controller
From: Heiner Kallweit
Date: Thu Mar 06 2025 - 17:59:45 EST

On 05.03.2025 23:20, Bjorn Helgaas wrote:
> [+cc r8169 maintainers, since upstream r8169 claims device 0x8125]
>
> On Wed, Mar 05, 2025 at 02:30:35PM +0800, hans.zhang@xxxxxxxxxxx wrote:
>> From: Hans Zhang <hans.zhang@xxxxxxxxxxx>
>>
>> This patch is intended to disable L0s ASPM link state for RTL8125 2.5GbE
>> Controller due to the fact that it is possible to corrupt TX data when
>> coming back out of L0s on some systems. This quirk uses the ASPM api to
>> prevent the ASPM subsystem from re-enabling the L0s state.
>
> Sounds like this should be a documented erratum. Realtek folks? Or
> maybe an erratum on the other end of the link, which looks like a CIX
> Root Port:
>
> https://admin.pci-ids.ucw.cz/read/PC/1f6c/0001
>
> If it's a CIX Root Port defect, it could affect devices other than
> RTL8125.
>
>> And it causes the following AER errors:
>> pcieport 0003:30:00.0: AER: Multiple Corrected error received: 0003:31:00.0
>> pcieport 0003:30:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
>> pcieport 0003:30:00.0: device [1f6c:0001] error status/mask=00001000/0000e000
>> pcieport 0003:30:00.0: [12] Timeout
>> r8125 0003:31:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
>> r8125 0003:31:00.0: device [10ec:8125] error status/mask=00001000/0000e000
>> r8125 0003:31:00.0: [12] Timeout
>> r8125 0003:31:00.0: AER: Error of this Agent is reported first
>
> Looks like a driver name of "r8125", but I don't see that upstream.
> Is this an out-of-tree driver?
>
Yes, this refers to Realtek's out-of-tree r8125 driver.
As stated by Hans, with the r8169 in-tree driver the issue doesn't occur.

>> And the RTL8125 website does not say that it supports L0s. It only supports
>> L1 and L1ss.
>>
>> RTL8125 website: https://www.realtek.com/Product/Index?id=3962
>
> I don't think it matters what the web site says. Apparently the
> device advertises L0s support via Link Capabilities.
>
>> Signed-off-by: Hans Zhang <hans.zhang@xxxxxxxxxxx>
>> Reviewed-by: Peter Chen <peter.chen@xxxxxxxxxxx>
>> ---
>> drivers/pci/quirks.c | 6 ++++++
>> 1 file changed, 6 insertions(+)
>>
>> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>> index 82b21e34c545..5f69bb5ee3ff 100644
>> --- a/drivers/pci/quirks.c
>> +++ b/drivers/pci/quirks.c
>> @@ -2514,6 +2514,12 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10f1, quirk_disable_aspm_l0s);
>> DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10f4, quirk_disable_aspm_l0s);
>> DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1508, quirk_disable_aspm_l0s);
>>
>> +/*
>> + * The RTL8125 may experience data corruption issues when transitioning out
>> + * of L0S. To prevent this we need to disable L0S on the PCIe link.
>> + */
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_REALTEK, 0x8125, quirk_disable_aspm_l0s);
>> +
>> static void quirk_disable_aspm_l0s_l1(struct pci_dev *dev)
>> {
>> pci_info(dev, "Disabling ASPM L0s/L1\n");
>>
>> base-commit: 99fa936e8e4f117d62f229003c9799686f74cebc
>> --
>> 2.47.1
>>