Re: [PATCH] mwifiex: Add quirk resetting the PCI bridge on MS Surface devices

From: Bjorn Helgaas
Date: Mon Oct 25 2021 - 19:56:38 EST


On Mon, Oct 25, 2021 at 06:45:29PM +0200, Jonas Dreßler wrote:
> On 10/18/21 17:35, Bjorn Helgaas wrote:
> > On Thu, Oct 14, 2021 at 12:08:31AM +0200, Jonas Dreßler wrote:
> > > On 10/12/21 17:39, Bjorn Helgaas wrote:
> > > > [+cc Vidya, Victor, ASPM L1.2 config issue; beginning of thread:
> > > > https://lore.kernel.org/all/20211011134238.16551-1-verdre@xxxxxxx/]
> >
> > > > I wonder if this reset quirk works because pci_reset_function() saves
> > > > and restores much of config space, but it currently does *not* restore
> > > > the L1 PM Substates capability, so those T_POWER_ON,
> > > > Common_Mode_Restore_Time, and LTR_L1.2_THRESHOLD values probably get
> > > > cleared out by the reset. We did briefly save/restore it [1], but we
> > > > had to revert that because of a regression that AFAIK was never
> > > > resolved [2]. I expect we will eventually save/restore this, so if
> > > > the quirk depends on it *not* being restored, that would be a problem.
> > > >
> > > > You should be able to test whether this is the critical thing by
> > > > clearing those registers with setpci instead of doing the reset. Per
> > > > spec, they can only be modified when L1.2 is disabled, so you would
> > > > have to disable it via sysfs (for the endpoint, I think)
> > > > /sys/.../l1_2_aspm and /sys/.../l1_2_pcipm, do the setpci on the root
> > > > port, then re-enable L1.2.
> > > >
> > > > [1] https://git.kernel.org/linus/4257f7e008ea
> > > > [2] https://lore.kernel.org/all/20210127160449.2990506-1-helgaas@xxxxxxxxxx/
> > >
> > > Hmm, interesting, thanks for those links.
> > >
> > > Are you sure the config values will get lost on the reset? If we
> > > only reset the port by going into D3hot and back into D0, the
> > > device will remain powered and won't lose the config space, will
> > > it?
> >
> > I think you're doing a PM reset (transition to D3hot and back to
> > D0). Linux only does this when PCI_PM_CTRL_NO_SOFT_RESET == 0.
> > The spec doesn't actually *require* the device to be reset; it
> > only says the internal state of the device is undefined after
> > these transitions.
>
> Not requiring the device to be reset sounds sensible to me given
> that D3hot is what devices are transitioned into during suspend.
>
> But anyway, that doesn't really get us any further except it
> somewhat gives an explanation why the LTR is suddenly 0 after the
> reset. Or are you making the point that we shouldn't rely on
> "undefined state" for this hack because not all PCI bridges/ports
> will necessarily behave the same?

I guess I'm just making the point that I don't understand why the
bridge reset fixes something, and I'm not confident that the fix will
work on every system and continue working even if/when the PCI core
starts saving and restoring the L1 PM Substates capability.
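
For anyone repeating the setpci experiment suggested earlier in the thread, a small decoder may make it easier to see whether the L1 PM Substates timing fields were actually cleared by the reset. This is only a sketch: the bit layout below follows my reading of the PCI_L1SS_CTL1_* and PCI_L1SS_CTL2_* masks in include/uapi/linux/pci_regs.h and should be checked against the spec, and the function names are made up for illustration. It assumes you have already read the two 32-bit Control registers (capability offsets 0x08 and 0x0c) from the root port by some other means.

```python
# Sketch: decode the L1 PM Substates Control 1 / Control 2 registers.
# Masks are assumed to match PCI_L1SS_CTL1_* / PCI_L1SS_CTL2_* in pci_regs.h.

CTL1_PCIPM_L1_2 = 0x00000001
CTL1_PCIPM_L1_1 = 0x00000002
CTL1_ASPM_L1_2 = 0x00000004
CTL1_ASPM_L1_1 = 0x00000008
CTL1_CM_RESTORE_TIME = 0x0000FF00   # Common_Mode_Restore_Time
CTL1_LTR_L12_TH_VALUE = 0x03FF0000  # LTR_L1.2_THRESHOLD_Value
CTL1_LTR_L12_TH_SCALE = 0xE0000000  # LTR_L1.2_THRESHOLD_Scale
CTL2_T_PWR_ON_SCALE = 0x00000003    # T_POWER_ON Scale
CTL2_T_PWR_ON_VALUE = 0x000000F8    # T_POWER_ON Value


def decode_l1ss(ctl1, ctl2):
    """Split the raw register values into named fields (hypothetical helper)."""
    th_value = (ctl1 & CTL1_LTR_L12_TH_VALUE) >> 16
    th_scale = (ctl1 & CTL1_LTR_L12_TH_SCALE) >> 29
    return {
        "pcipm_l1_2": bool(ctl1 & CTL1_PCIPM_L1_2),
        "pcipm_l1_1": bool(ctl1 & CTL1_PCIPM_L1_1),
        "aspm_l1_2": bool(ctl1 & CTL1_ASPM_L1_2),
        "aspm_l1_1": bool(ctl1 & CTL1_ASPM_L1_1),
        "common_mode_restore_time": (ctl1 & CTL1_CM_RESTORE_TIME) >> 8,
        "ltr_l1_2_threshold_value": th_value,
        "ltr_l1_2_threshold_scale": th_scale,
        # The threshold scale is a power-of-32 multiplier in ns; values
        # above 5 are not defined, so no ns figure is reported for them.
        "ltr_l1_2_threshold_ns": th_value * 32 ** th_scale if th_scale <= 5 else None,
        "t_power_on_scale": ctl2 & CTL2_T_PWR_ON_SCALE,
        "t_power_on_value": (ctl2 & CTL2_T_PWR_ON_VALUE) >> 3,
    }


def timing_fields_cleared(ctl1, ctl2):
    """True if T_POWER_ON, Common_Mode_Restore_Time and the LTR threshold are all 0."""
    f = decode_l1ss(ctl1, ctl2)
    return (
        f["common_mode_restore_time"] == 0
        and f["ltr_l1_2_threshold_value"] == 0
        and f["ltr_l1_2_threshold_scale"] == 0
        and f["t_power_on_value"] == 0
        and f["t_power_on_scale"] == 0
    )
```

Comparing the decoded values before and after the bridge reset (and again after clearing the registers by hand with L1.2 disabled) would show whether the quirk's effect is explained by these fields going to zero.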