Re: [PATCH] r8169: Reinstate ALDPS and ASPM support

From: Bjorn Helgaas
Date: Tue Jun 05 2018 - 13:28:21 EST


On Tue, Jun 05, 2018 at 06:34:09AM +0000, Ryankao wrote:
> Add realtek folk Hau
>
> -----Original Message-----
> From: Kai Heng Feng [mailto:kai.heng.feng@xxxxxxxxxxxxx]
> Sent: Tuesday, June 05, 2018 1:02 PM
> To: jrg.otte@xxxxxxxxx
> Cc: David Miller <davem@xxxxxxxxxxxxx>; Hayes Wang <hayeswang@xxxxxxxxxxx>; hkallweit1@xxxxxxxxx; romieu@xxxxxxxxxxxxx; Linux Netdev List <netdev@xxxxxxxxxxxxxxx>; Linux Kernel Mailing List <linux-kernel@xxxxxxxxxxxxxxx>; Ryankao <ryankao@xxxxxxxxxxx>
> Subject: Re: [PATCH] r8169: Reinstate ALDPS and ASPM support
>
> Hi Jörg Otte,
>
> Can you give this patch a try?
>
> Since you are the only one that reported ALDPS/ASPM regression,
>
> And I think this patch should solve the issue you had [1].
>
> Hopefully we don't need to go down the rabbit hole of blacklist/whitelist...
>
> Kai-Heng
>
> [1] https://lkml.org/lkml/2013/1/5/36

I have no idea what ALDPS is. It's not mentioned in the PCIe spec, so
presumably it's some Realtek-specific thing. ASPM is a generic PCIe
thing. Changes to these two things should be in separate patches so
they don't get tangled up.

> > On Jun 5, 2018, at 12:58 PM, Kai-Heng Feng
> > <kai.heng.feng@xxxxxxxxxxxxx>
> > wrote:
> >
> > This patch reinstate ALDPS and ASPM support on r8169.
> >
> > On some Intel platforms, ASPM support on r8169 is the key factor to
> > let Package C-State achieve PC8. Without ASPM support, the deepest
> > Package C-State can hit is PC3. PC8 can save additional ~3W in
> > comparison with PC3.
> >
> > This patch is from Realtek.
> >
> > Fixes: e0c075577965 ("r8169: enable ALDPS for power saving")
> > Fixes: d64ec841517a ("r8169: enable internal ASPM and clock request
> > settings")

> > +3507,15 @@ static void rtl8168e_1_hw_phy_config(struct
> > rtl8169_private *tp)
> > rtl_writephy(tp, 0x0d, 0x4007);
> > rtl_writephy(tp, 0x0e, 0x0000);
> > rtl_writephy(tp, 0x0d, 0x0000);
> > +
> > + /* Check ALDPS bit, disable it if enabled */
> > + rtl_writephy(tp, 0x1f, 0x0000);
> > + if (enable_aldps)
> > + rtl_w0w1_phy(tp, 0x15, 0x1000, 0x0000);
> > + else if (rtl_readphy(tp, 0x15) & 0x1000)
> > + rtl_w0w1_phy(tp, 0x15, 0x0000, 0x1000);

There's a lot of repetition of this code with minor variations. You
could probably factor it out and make it more concise and more
readable.

> > +static void rtl8169_check_link_status(struct net_device *dev,
> > + struct rtl8169_private *tp) {
> > + struct device *d = tp_to_dev(tp);
> > +
> > + if (tp->link_ok(tp)) {
> > + rtl_link_chg_patch(tp);
> > + /* This is to cancel a scheduled suspend if there's one. */
> > + if (pm_request_resume(d))
> > + _rtl_reset_work(tp);
> > + netif_carrier_on(dev);
> > + if (net_ratelimit())
> > + netif_info(tp, ifup, dev, "link up\n");
> > + } else {
> > + netif_carrier_off(dev);
> > + netif_info(tp, ifdown, dev, "link down\n");
> > + pm_runtime_idle(d);
> > + }
> > +}

This function apparently just got moved around without changing
anything. That's fine, but the move should be in a separate patch to
make the real changes easier to review.

> > @@ -7649,8 +7757,12 @@ static int rtl_init_one(struct pci_dev *pdev,
> > const struct pci_device_id *ent)
> >
> > /* disable ASPM completely as that cause random device stop working
> > * problems as well as full system hangs for some PCIe devices users */
> > - pci_disable_link_state(pdev, PCIE_LINK_STATE_L0S | PCIE_LINK_STATE_L1 |
> > - PCIE_LINK_STATE_CLKPM);
> > + if (!enable_aspm) {
> > + pci_disable_link_state(pdev, PCIE_LINK_STATE_L0S |
> > + PCIE_LINK_STATE_L1 |
> > + PCIE_LINK_STATE_CLKPM);
> > + netif_info(tp, probe, dev, "ASPM disabled\n");
> > + }

ASPM is a generic PCIe feature that should be configured by the PCI
core without any help from the device driver.

If code in the driver is needed, that means either the PCI core is
doing it wrong and we should fix it there, or the device is broken and
the driver is working around the erratum.

If this is an erratum, you should include details about exactly what's
broken and (ideally) a URL to the published erratum. Otherwise this
is just unmaintainable black magic and likely to be broken by future
ASPM changes in the PCI core.

ASPM configuration is done by the PCI core before drivers are bound to
the device. If you need device-specific workarounds, they should
probably be in quirks so they're done before the core does that ASPM
configuration.

> > /* enable device (incl. PCI PM wakeup and hotplug setup) */
> > rc = pcim_enable_device(pdev);
> > --
> > 2.17.0
>
> ------Please consider the environment before printing this e-mail.