Re: [Intel-wired-lan] [PATCH] PCI/ASPM: Enable ASPM on external PCIe devices

From: Mario Limonciello
Date: Fri Jul 14 2023 - 12:37:42 EST


On 7/14/23 03:17, Kai-Heng Feng wrote:
On Thu, Jul 6, 2023 at 12:07 PM Mario Limonciello
<mario.limonciello@xxxxxxx> wrote:

On 7/5/23 15:06, Bjorn Helgaas wrote:
On Wed, Jun 28, 2023 at 01:09:49PM +0800, Kai-Heng Feng wrote:
On Wed, Jun 28, 2023 at 4:54 AM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
On Tue, Jun 27, 2023 at 04:35:25PM +0800, Kai-Heng Feng wrote:
On Fri, Jun 23, 2023 at 7:06 AM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
On Tue, Jun 20, 2023 at 01:36:59PM -0500, Limonciello, Mario wrote:

It's perfectly fine for the IP to support PCI features that are not
and can not be enabled in a system design. But I expect that
strapping or firmware would disable those features so they are not
advertised in config space.

If BIOS leaves features disabled because they cannot work, but at the
same time leaves them advertised in config space, I'd say that's a
BIOS defect. In that case, we should have a DMI quirk or something to
work around the defect.

That means most if not all BIOS are defected.
BIOS vendors and ODM never bothered (and probably will not) to change
the capabilities advertised by config space because "it already works
under Windows".

This is what seems strange to me. Are you saying that Windows never
enables these power-saving features? Or that Windows includes quirks
for all these broken BIOSes? Neither idea seems very convincing.


I see your point. I was looking through Microsoft documentation for
hints and came across this:

https://learn.microsoft.com/en-us/windows-hardware/customize/power-settings/pci-express-settings-link-state-power-management

They have a policy knob to globally set L0 or L1 for PCIe links.

They don't explicitly say it, but surely it's based on what the devices
advertise in the capabilities registers.

So essentially it can be achieved via boot time kernel parameter
and/or sysfs knob.

The main point is OS should stick to the BIOS default, which is the
only ASPM setting tested before putting hardware to the market.

Unfortunately; I don't think you can jump to this conclusion.

A big difference in the Windows world to Linux world is that OEMs ship with a factory Windows image that may set policies like this. OEM "platform" drivers can set registry keys too.

I think the next ASPM issue that comes up what we (collectively) need to do is compare ASPM policy and PCI registers for:
1) A "clean" Windows install from Microsoft media before all the OEM drivers are installed.
2) A Windows install that the drivers have been installed.
3) A up to date mainline Linux kernel.

Actually as this thread started for determining policy for external PCIe devices, maybe that would be good to check with those.


Kai-Heng


So the logic is to ignore the capability and trust the default set
by BIOS.

I think limiting ASPM support to whatever BIOS configured at boot-time
is problematic. I don't think we can assume that all platforms have
firmware that configures ASPM as aggressively as possible, and
obviously firmware won't configure hot-added devices at all (in
general; I know ACPI _HPX can do some of that).

Totally agree. I was not suggesting to limiting the setting at all.
A boot-time parameter to flip ASPM setting is very useful. If none has
been set, default to BIOS setting.

A boot-time parameter for debugging and workarounds is fine. IMO,
needing a boot-time parameter in the course of normal operation is
not OK.

Bjorn