On Thu, Oct 14, 2021 at 12:08:31AM +0200, Jonas Dreßler wrote:
> On 10/12/21 17:39, Bjorn Helgaas wrote:
> > [+cc Vidya, Victor, ASPM L1.2 config issue; beginning of thread:
> > https://lore.kernel.org/all/20211011134238.16551-1-verdre@xxxxxxx/]
> >
> > I wonder if this reset quirk works because pci_reset_function() saves
> > and restores much of config space, but it currently does *not* restore
> > the L1 PM Substates capability, so those T_POWER_ON,
> > Common_Mode_Restore_Time, and LTR_L1.2_THRESHOLD values probably get
> > cleared out by the reset.  We did briefly save/restore it [1], but we
> > had to revert that because of a regression that AFAIK was never
> > resolved [2].  I expect we will eventually save/restore this, so if
> > the quirk depends on it *not* being restored, that would be a problem.
> >
> > You should be able to test whether this is the critical thing by
> > clearing those registers with setpci instead of doing the reset.  Per
> > spec, they can only be modified when L1.2 is disabled, so you would
> > have to disable it via sysfs (for the endpoint, I think)
> > /sys/.../l1_2_aspm and /sys/.../l1_2_pcipm, do the setpci on the root
> > port, then re-enable L1.2.
> >
> > [1] https://git.kernel.org/linus/4257f7e008ea
> > [2] https://lore.kernel.org/all/20210127160449.2990506-1-helgaas@xxxxxxxxxx/
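For anyone running the experiment described above, it helps to decode the
L1 PM Substates Control 1 and Control 2 registers before and after clearing
them. The following is only a sketch: the bit masks are the PCI_L1SS_CTL1_* and
PCI_L1SS_CTL2_* values from include/uapi/linux/pci_regs.h, while the function
names decode_l1ss() and timing_fields_cleared() are hypothetical helpers, and
the two 32-bit inputs are assumed to have been read from the root port's config
space by some other means.

```python
# Bit masks as defined in include/uapi/linux/pci_regs.h.
PCI_L1SS_CTL1_PCIPM_L1_2 = 0x00000001
PCI_L1SS_CTL1_ASPM_L1_2 = 0x00000004
PCI_L1SS_CTL1_CM_RESTORE_TIME = 0x0000FF00
PCI_L1SS_CTL1_LTR_L12_TH_VALUE = 0x03FF0000
PCI_L1SS_CTL1_LTR_L12_TH_SCALE = 0xE0000000
PCI_L1SS_CTL2_T_PWR_ON_SCALE = 0x00000003
PCI_L1SS_CTL2_T_PWR_ON_VALUE = 0x000000F8

# T_POWER_ON scale encodings in microseconds; encoding 3 is reserved.
T_POWER_ON_SCALE_US = {0: 2, 1: 10, 2: 100}


def decode_l1ss(ctl1, ctl2):
    """Decode raw L1SS Control 1/2 register values (hypothetical helper)."""
    th_value = (ctl1 & PCI_L1SS_CTL1_LTR_L12_TH_VALUE) >> 16
    th_scale = (ctl1 & PCI_L1SS_CTL1_LTR_L12_TH_SCALE) >> 29
    pwr_scale = ctl2 & PCI_L1SS_CTL2_T_PWR_ON_SCALE
    pwr_value = (ctl2 & PCI_L1SS_CTL2_T_PWR_ON_VALUE) >> 3
    scale_us = T_POWER_ON_SCALE_US.get(pwr_scale)
    return {
        "pcipm_l1_2": bool(ctl1 & PCI_L1SS_CTL1_PCIPM_L1_2),
        "aspm_l1_2": bool(ctl1 & PCI_L1SS_CTL1_ASPM_L1_2),
        "common_mode_restore_time_us":
            (ctl1 & PCI_L1SS_CTL1_CM_RESTORE_TIME) >> 8,
        # Threshold is value * 32^scale ns; scales above 5 are reserved.
        "ltr_l1_2_threshold_ns":
            th_value * 32 ** th_scale if th_scale <= 5 else None,
        "t_power_on_us":
            pwr_value * scale_us if scale_us is not None else None,
    }


def timing_fields_cleared(ctl1, ctl2):
    """True if all three timing fields named in the mail read as zero."""
    timing_mask = (PCI_L1SS_CTL1_CM_RESTORE_TIME
                   | PCI_L1SS_CTL1_LTR_L12_TH_VALUE
                   | PCI_L1SS_CTL1_LTR_L12_TH_SCALE)
    return (ctl1 & timing_mask) == 0 and (ctl2 & 0xFB) == 0


if __name__ == "__main__":
    # Synthetic register values, not taken from any real device.
    print(decode_l1ss(0x40502805, 0x29))
    print(timing_fields_cleared(0x00000005, 0x00))
```

Comparing the decoded values after the reset quirk with the values after a
manual clear would show whether the quirk has the same effect on these fields.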
> Hmm, interesting, thanks for those links.
>
> Are you sure the config values will get lost on the reset? If we only reset
> the port by going into D3hot and back into D0, the device will remain powered
> and won't lose the config space, will it?
I think you're doing a PM reset (transition to D3hot and back to D0).
Linux only does this when PCI_PM_CTRL_NO_SOFT_RESET == 0. The spec
doesn't actually *require* the device to be reset; it only says the
internal state of the device is undefined after these transitions.
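
To illustrate the No_Soft_Reset condition mentioned above, here is a rough
sketch that walks the capability list in a raw config-space dump and checks
that bit in the PM Control/Status register. The register offsets and masks
follow include/uapi/linux/pci_regs.h; the function names find_pm_cap() and
pm_reset_possible() are hypothetical, and the sketch only mirrors this one
check, not everything the kernel's reset path considers. The input is assumed
to be the bytes of the device's sysfs "config" file or an equivalent dump.

```python
import struct

# Offsets and masks as defined in include/uapi/linux/pci_regs.h.
PCI_STATUS = 0x06
PCI_STATUS_CAP_LIST = 0x10
PCI_CAPABILITY_LIST = 0x34
PCI_CAP_ID_PM = 0x01
PCI_PM_CTRL = 4
PCI_PM_CTRL_NO_SOFT_RESET = 0x0008


def find_pm_cap(config):
    """Return the offset of the PM capability, or 0 if it is not found."""
    if len(config) <= PCI_CAPABILITY_LIST:
        return 0
    (status,) = struct.unpack_from("<H", config, PCI_STATUS)
    if not status & PCI_STATUS_CAP_LIST:
        return 0
    pos = config[PCI_CAPABILITY_LIST] & 0xFC
    hops = 0
    # Bound the walk so a malformed (looping) list cannot hang us.
    while pos >= 0x40 and pos + 1 < len(config) and hops < 48:
        if config[pos] == PCI_CAP_ID_PM:
            return pos
        pos = config[pos + 1] & 0xFC
        hops += 1
    return 0


def pm_reset_possible(config):
    """True if a PM capability exists and No_Soft_Reset is 0."""
    cap = find_pm_cap(config)
    if not cap or cap + PCI_PM_CTRL + 2 > len(config):
        return False
    (pmcsr,) = struct.unpack_from("<H", config, cap + PCI_PM_CTRL)
    return not (pmcsr & PCI_PM_CTRL_NO_SOFT_RESET)


if __name__ == "__main__":
    # Synthetic 256-byte config space with a PM capability at 0x40.
    cfg = bytearray(256)
    struct.pack_into("<H", cfg, PCI_STATUS, PCI_STATUS_CAP_LIST)
    cfg[PCI_CAPABILITY_LIST] = 0x40
    cfg[0x40] = PCI_CAP_ID_PM
    print(pm_reset_possible(bytes(cfg)))
```

Even when this returns True, it says only that the D3hot/D0 transition would
be attempted as a reset method; as noted above, the spec leaves the resulting
device state undefined rather than guaranteeing a reset.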