Re: rtsx_pci not restoring ASPM state after suspend/resume

From: Bjorn Helgaas
Date: Mon Jul 27 2020 - 17:47:16 EST


On Mon, Jul 27, 2020 at 08:52:25PM +0100, James Ettle wrote:
> On Mon, 2020-07-27 at 09:14 -0500, Bjorn Helgaas wrote:
> > I don't know the connection between ASPM and package C-states, so I
> > need to simplify this even more. All I want to do right now is
> > verify
> > that if we don't have any outside influences on the ASPM
> > configuration
> > (eg, no manual changes and no udev rules), it stays the same across
> > suspend/resume.
>
> Basically this started from me observing deep package C-states weren't
> being used, until I went and fiddled with the ASPM state of the
> rtsx_pci card reader under sysfs -- so phenomenological poking on my
> part.
>
> > So let's read the ASPM state directly from the
> > hardware like this:
> >
> > sudo lspci -vvs 00:1d.0 | egrep "^0|Lnk|L1|LTR|snoop"
> > sudo lspci -vvs 01:00 | egrep "^0|Lnk|L1|LTR|snoop"
> >
> > Can you try that before and after suspend/resume?
>
> I've attached these to the bugzilla entry at:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=208117
>
> Spoiler: With no udev rules or suspend hooks, things are the same
> before and after suspend/resume. One thing I do see (both before and
> after) is that ASPM L0s and L1 is enabled for the card reader, but
> disabled for the ethernet chip (does r8169 fiddle with ASPM too?).

Thank you! It's good that this stays the same across suspend/resume.
Do you see different C-state behavior before vs after?

This is the config I see:

00:1d.0 bridge to [bus 01]: ASPM L1 supported; ASPM Disabled
01:00.0 card reader: ASPM L0s L1 supported; L0s L1 Enabled
01:00.1 GigE NIC: ASPM L0s L1 supported; ASPM Disabled

This is actually illegal because PCIe r5.0, sec 5.4.1.3, says software
must not enable L0s in either direction unless components on both ends
of the link support L0s. The bridge (00:1d.0) does not support L0s,
so it's illegal to enable L0s on 01:00.0. I don't know whether this
causes problems in practice.

I don't see anything in rtsx that enables L0s. Can you collect the
dmesg log when booting with "pci=earlydump"? That will show whether
the BIOS left it this way. The PCI core isn't supposed to do this, so
if it did, we need to fix that.

I don't know whether r8169 mucks with ASPM. It is legal to have
different configurations for the two functions, even though they share
the same link. Sec 5.4.1 has rules about how hardware resolves
differences.

> [Oddly when I set ASPM (e.g. using udev) the lspci tools show ASPM
> enabled after a suspend/resume, but still no deep package C-states
> until I manually fiddle via sysfs on the card reader. Sorry if this
> only muddies the water further!]

Let's defer this for now. It sounds worth pursuing, but I can't keep
everything in my head at once.

Bjorn