Re: [PATCH net-next v9 3/5] r8169: Consider chip-specific ASPM can be enabled on more cases

From: Bjorn Helgaas
Date: Thu Mar 09 2023 - 15:17:17 EST


On Sat, Feb 25, 2023 at 11:46:33AM +0800, Kai-Heng Feng wrote:
> To really enable ASPM on r8169 NICs, both standard PCIe ASPM and
> chip-specific ASPM have to be enabled at the same time.
>
> Before enabling ASPM at chip side, make sure the following conditions
> are met:
> 1) Use pcie_aspm_support_enabled() to check if ASPM is disabled by
> kernel parameter.
> 2) Use pcie_aspm_capable() to see if the device is capable to perform
> PCIe ASPM.
> 3) Check the return value of pci_disable_link_state(). If it's -EPERM,
> it means BIOS doesn't grant ASPM control to OS, and device should use
> the ASPM setting as is.
>
> Consider ASPM is manageable when those conditions are met.
>
> While at it, disable ASPM at chip-side for TX timeout reset, since
> pci_disable_link_state() doesn't have any effect when OS isn't granted
> with ASPM control.

1) "While at it, ..." is always a hint that maybe this part could be
split to a separate patch.

2) The mix of chip-specific and standard PCIe ASPM configuration is a
mess. Does it *have* to be intermixed at run-time, or could all the
chip-specific stuff be done once, e.g., maybe chip-specific ASPM
enable could be done at probe-time, and then all subsequent ASPM
configuration could be done via the standard PCIe registers?

I.e., does the chip work correctly if chip-specific ASPM is enabled,
but standard PCIe ASPM config is *disabled*?

The ASPM sysfs controls [1] assume that L0s, L1, L1.1, L1.2 can all be
controlled simply by using the standard PCIe registers. If that's not
the case for r8169, things will break when people use the sysfs knobs.
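
To make the concern concrete, here is a minimal userspace sketch of it.
It is a model under stated assumptions, not driver code: the struct and
the model_* names are hypothetical, and it takes the patch description
at its word that L1 is only really in effect when both the standard
PCIe enable and the chip-specific enable are set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; none of these names exist in the kernel. */
struct aspm_model {
	bool link_l1_enabled;   /* standard PCIe ASPM L1 enable, what sysfs toggles */
	bool chip_aspm_enabled; /* chip-specific enable (ASPM_en in Config5 on r8169) */
};

/* Per the patch description, ASPM is only really on when both sides agree. */
static bool model_l1_effective(const struct aspm_model *m)
{
	assert(m != NULL);
	return m->link_l1_enabled && m->chip_aspm_enabled;
}

/* Models a write to the sysfs l1_aspm knob: only the standard side changes. */
static void model_sysfs_set_l1(struct aspm_model *m, bool enable)
{
	assert(m != NULL);
	m->link_l1_enabled = enable;
}
```

In this model, calling model_sysfs_set_l1(&m, true) while
chip_aspm_enabled is false leaves model_l1_effective() false, which is
the kind of mismatch a sysfs user would not expect.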

Bjorn

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/ABI/testing/sysfs-bus-pci?id=v6.2#n420

> Signed-off-by: Kai-Heng Feng <kai.heng.feng@xxxxxxxxxxxxx>
> ---
> v9:
> - No change.
>
> v8:
> - Enable chip-side ASPM only when PCIe ASPM is already available.
> - Wording.
>
> v7:
> - No change.
>
> v6:
> - Unconditionally enable chip-specific ASPM.
>
> v5:
> - New patch.
>
> drivers/net/ethernet/realtek/r8169_main.c | 22 ++++++++++++++++++----
> 1 file changed, 18 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
> index 45147a1016bec..a857650c2e82b 100644
> --- a/drivers/net/ethernet/realtek/r8169_main.c
> +++ b/drivers/net/ethernet/realtek/r8169_main.c
> @@ -2675,8 +2675,11 @@ static void rtl_disable_exit_l1(struct rtl8169_private *tp)
>
>  static void rtl_hw_aspm_clkreq_enable(struct rtl8169_private *tp, bool enable)
>  {
> -	/* Don't enable ASPM in the chip if OS can't control ASPM */
> -	if (enable && tp->aspm_manageable) {
> +	/* Skip if PCIe ASPM isn't possible */
> +	if (!tp->aspm_manageable)
> +		return;
> +
> +	if (enable) {
>  		RTL_W8(tp, Config5, RTL_R8(tp, Config5) | ASPM_en);
>  		RTL_W8(tp, Config2, RTL_R8(tp, Config2) | ClkReqEn);
>
> @@ -4545,8 +4548,13 @@ static void rtl_task(struct work_struct *work)
>  		/* ASPM compatibility issues are a typical reason for tx timeouts */
>  		ret = pci_disable_link_state(tp->pci_dev, PCIE_LINK_STATE_L1 |
>  							  PCIE_LINK_STATE_L0S);
> +
> +		/* OS may not be granted to control PCIe ASPM, prevent the driver from using it */
> +		tp->aspm_manageable = 0;
> +
>  		if (!ret)
>  			netdev_warn_once(tp->dev, "ASPM disabled on Tx timeout\n");
> +
>  		goto reset;
>  	}
>
> @@ -5227,13 +5235,19 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
>  	 * Chips from RTL8168h partially have issues with L1.2, but seem
>  	 * to work fine with L1 and L1.1.
>  	 */
> -	if (rtl_aspm_is_safe(tp))
> +	if (!pcie_aspm_support_enabled() || !pcie_aspm_capable(pdev))
> +		rc = -EINVAL;
> +	else if (rtl_aspm_is_safe(tp))
>  		rc = 0;
>  	else if (tp->mac_version >= RTL_GIGA_MAC_VER_46)
>  		rc = pci_disable_link_state(pdev, PCIE_LINK_STATE_L1_2);
>  	else
>  		rc = pci_disable_link_state(pdev, PCIE_LINK_STATE_L1);
> -	tp->aspm_manageable = !rc;
> +
> +	/* -EPERM means BIOS doesn't grant OS ASPM control, ASPM should be use
> +	 * as is. Honor it.
> +	 */
> +	tp->aspm_manageable = (rc == -EPERM) ? 1 : !rc;
>
>  	tp->dash_type = rtl_check_dash(tp);
>
> --
> 2.34.1
>
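
For reference, the probe-time decision in the rtl_init_one() hunk above
can be modeled in userspace roughly as follows. This is only a sketch:
model_aspm_manageable() and its parameters are hypothetical stand-ins
for the kernel calls in the diff, and disable_rc represents whatever
pci_disable_link_state() would have returned.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical userspace model of the rtl_init_one() hunk.
 * disable_rc stands in for the return value of pci_disable_link_state();
 * it is only consulted when the chip is not considered ASPM-safe.
 */
static bool model_aspm_manageable(bool support_enabled, bool capable,
				  bool aspm_is_safe, int disable_rc)
{
	int rc;

	assert(disable_rc <= 0); /* kernel style: 0 or a negative errno */

	if (!support_enabled || !capable)
		rc = -EINVAL;
	else if (aspm_is_safe)
		rc = 0;
	else
		rc = disable_rc;

	/* -EPERM: firmware keeps ASPM control; the patch treats that as manageable. */
	return rc == -EPERM || rc == 0;
}
```

Written this way, the only inputs that yield "not manageable" are ASPM
being disabled or unsupported, or a failure other than -EPERM from the
link-state call.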