Re: [PATCH v2 9/9] wifi: ath10k: Use RMW accessors for changing LNKCTL

From: Bjorn Helgaas
Date: Fri May 26 2023 - 18:21:24 EST


On Thu, May 25, 2023 at 01:11:51PM +0300, Ilpo Järvinen wrote:
> On Wed, 24 May 2023, Bjorn Helgaas wrote:
> > On Wed, May 17, 2023 at 01:52:35PM +0300, Ilpo Järvinen wrote:
> > > Don't assume that only the driver would be accessing LNKCTL. ASPM
> > > policy changes can trigger write to LNKCTL outside of driver's control.
> > >
> > > Use RMW capability accessors which does proper locking to avoid losing
> > > concurrent updates to the register value. On restore, clear the ASPMC
> > > field properly.
> > >
> > > Fixes: 76d870ed09ab ("ath10k: enable ASPM")
> > > Suggested-by: Lukas Wunner <lukas@xxxxxxxxx>
> > > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxxxxxx>
> > > Cc: stable@xxxxxxxxxxxxxxx
> > > ---
> > > drivers/net/wireless/ath/ath10k/pci.c | 9 +++++----
> > > 1 file changed, 5 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/drivers/net/wireless/ath/ath10k/pci.c b/drivers/net/wireless/ath/ath10k/pci.c
> > > index a7f44f6335fb..9275a672f90c 100644
> > > --- a/drivers/net/wireless/ath/ath10k/pci.c
> > > +++ b/drivers/net/wireless/ath/ath10k/pci.c
> > > @@ -1963,8 +1963,9 @@ static int ath10k_pci_hif_start(struct ath10k *ar)
> > > ath10k_pci_irq_enable(ar);
> > > ath10k_pci_rx_post(ar);
> > >
> > > - pcie_capability_write_word(ar_pci->pdev, PCI_EXP_LNKCTL,
> > > - ar_pci->link_ctl);
> > > + pcie_capability_clear_and_set_word(ar_pci->pdev, PCI_EXP_LNKCTL,
> > > + PCI_EXP_LNKCTL_ASPMC,
> > > + ar_pci->link_ctl & PCI_EXP_LNKCTL_ASPMC);
> > >
> > > return 0;
> > > }
> > > @@ -2821,8 +2822,8 @@ static int ath10k_pci_hif_power_up(struct ath10k *ar,
> > >
> > > pcie_capability_read_word(ar_pci->pdev, PCI_EXP_LNKCTL,
> > > &ar_pci->link_ctl);
> > > - pcie_capability_write_word(ar_pci->pdev, PCI_EXP_LNKCTL,
> > > - ar_pci->link_ctl & ~PCI_EXP_LNKCTL_ASPMC);
> > > + pcie_capability_clear_word(ar_pci->pdev, PCI_EXP_LNKCTL,
> > > + PCI_EXP_LNKCTL_ASPMC);
> >
> > These ath drivers all have the form:
> >
> > 1) read LNKCTL
> > 2) save LNKCTL value in ->link_ctl
> > 3) write LNKCTL with "->link_ctl & ~PCI_EXP_LNKCTL_ASPMC"
> > to disable ASPM
> > 4) write LNKCTL with ->link_ctl, presumably to re-enable ASPM
> >
> > These patches close the hole between 1) and 3) where other LNKCTL
> > updates could interfere, which is definitely a good thing.
> >
> > But the hole between 1) and 4) is much bigger and still there. Any
> > update by the PCI core in that interval would be lost.
>
> Any update to PCI_EXP_LNKCTL_ASPMC field in that interval is lost yes, the
> updates to _the other fields_ in LNKCTL are not lost.

Ah, yes, you're right, I missed the masking to PCI_EXP_LNKCTL_ASPMC in
the pcie_capability_clear_word().

> > Straw-man proposal:
> >
> > - Change pci_disable_link_state() so it ignores aspm_disabled and
> > always disables ASPM even if platform firmware hasn't granted
> > ownership. Maybe this should warn and taint the kernel.
> >
> > - Change drivers to use pci_disable_link_state() instead of writing
> > LNKCTL directly.
>
> I fully agree that's the direction we should be moving, yes. However, I'm
> a bit hesitant to take that leap in one step. These drivers currently not
> only disable ASPM but also re-enable it (assuming we guessed the intent
> right).
>
> If I directly implement that proposal, ASPM is not going to be re-enabled
> when PCI core does not allowing it. Could it cause some power related
> regression?

IIUC the potential problem only happens with:

- A platform that enables ASPM but doesn't grant PCIe Capability
ownership to the OS, and

- A device where we force-disable ASPM, presumably to avoid some
hardware defect.

I'm not sure this case is worth worrying about. A platform that
enables ASPM without allowing the OS to disable it is taking a risk
because it can't know about these device defects or even about user
preferences. A device that has an ASPM-related defect may use more
power than necessary. I think that's to be expected.

> My plan is to make another patch series after these to realize exactly
> what you're proposing. It would allow better to isolate the problems that
> related to the lack of ASPM.
>
> I hope this two step approach is an acceptable way forward? I can of
> course add those patches on top of these if that would be preferrable.

I think two steps is OK. It's a little more work for the driver
maintainers to review them, but this step is pretty trivial already
reviewed (except for the GPUs, which are probably the most important :)).

Bjorn