Re: [PATCH V2] PCI/ASPM: Save/restore L1SS Capability for suspend/resume

From: Kai-Heng Feng
Date: Thu Aug 25 2022 - 23:13:23 EST


On Fri, Aug 26, 2022 at 7:01 AM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
>
> On Tue, Aug 23, 2022 at 10:55:01PM +0800, Kai-Heng Feng wrote:
> > On Tue, Aug 9, 2022 at 12:17 AM Vidya Sagar <vidyas@xxxxxxxxxx> wrote:
> > >
> > > Thanks Lukasz for the update.
> > > I think confirms that there is no issue with the patch as such.
> > > Bjorn, could you please define the next step for this patch?
> >
> > I think the L1SS cap went away _after_ L1SS registers are restored,
> > since your patch already check the cap before doing any write:
> > + aspm_l1ss = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_L1SS);
> > + if (!aspm_l1ss)
> > + return;
> >
> > That means it's more likely to be caused by the following change:
> > + pci_write_config_dword(dev, aspm_l1ss + PCI_L1SS_CTL2, *cap++);
> > + pci_write_config_dword(dev, aspm_l1ss + PCI_L1SS_CTL1, *cap++);
> >
> > So is it possible to clear PCI_L1SS_CTL1 before setting PCI_L1SS_CTL2,
> > like what aspm_calc_l1ss_info() does?
>
> Sorry, I've totally lost track of where we are with this. I guess the
> object is to save/restore L1SS state.
>
> And there are two problems that aren't understood yet?
>
> 1) Lukasz's 01:00.0 wifi device didn't work immediately after
> resume, but seemed to be hot-added later? [1]
>
> 2) The 00:14.0 Root Port L1SS capability was present before
> suspend/resume but not after? [2,3]
>
> I thought Lukasz's latest emails [4,5] indicated that problem 1) still
> happened and presumably only happens with Vidya's patch, and 2) also
> still happens, but happens even *without* Vidya's patch. Do I have
> that right?

Thanks, so root port already losing its L1SS cap before applying the patch.

>
> If adding the patch causes 1), obviously we would need to fix that.
> It would certainly be good to understand 2) as well, but I guess if
> that's a pre-existing problem, ...

I wonder if checking parent device's L1SS cap in
pci_restore_aspm_l1ss_state() a good workaround?

If this is indeed a firmware side issue, it explains why Kenneth's XPS
doesn't have this issue anymore after some BIOS updates.

Kai-Heng

>
> Bjorn
>
> [1] https://gist.github.com/semihalf-majczak-lukasz/fb36dfa2eff22911109dfb91ab0fc0e3#file-dmesg-L1762
> [2] https://gist.github.com/semihalf-majczak-lukasz/fb36dfa2eff22911109dfb91ab0fc0e3#file-lspci-before-suspend-log-L136
> [3] https://gist.github.com/semihalf-majczak-lukasz/fb36dfa2eff22911109dfb91ab0fc0e3#file-lspci-after-suspend-log-L136
> [4] https://lore.kernel.org/r/CAFJ_xbr5NjoV1jC3P93N4UgooUuNdCRnrX7HuK=xLtPM5y7EjA@xxxxxxxxxxxxxx
> [5] https://lore.kernel.org/r/CAFJ_xboyQyEaDeQ+pZH_YqN52-ALGNqzmmzeyNt6X_Cz-c1w9Q@xxxxxxxxxxxxxx