Re: /sys/module/pcie_aspm/parameters/policy not writable?

From: Bjorn Helgaas
Date: Wed Jul 10 2013 - 15:57:32 EST


[+cc Jeff, Jesse, et al, e1000-devel]

Holy cow, you guys have a lot of folks listed in MAINTAINERS for Intel
drivers :) This is an ASPM question, if that helps narrow down the
folks interested.

On Wed, Jul 10, 2013 at 7:29 AM, Pavel Machek <pavel@xxxxxx> wrote:
> Hi!
>
>> >> But:
>> >> 1) it should not list unavailable options
>> >>
>> >> 2) operation not permitted seems like wrong error code for
>> >> operation not supported.
>> >
>> > So I forcibly enabled ASPM, and now ping latencies are in normal
>> > range... no matter how I set
>> > /sys/module/pcie_aspm/parameters/policy. Strange.
>> >
>> > Any ideas what correct solution is?
>> > Pavel
>> > Signed-off-by: Pavel Machek <pavel@xxxxxx>
>> > (but don't apply)
> ...
>> > diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
>> > index e4b1fb2..9a1b63e 100644
>> > --- a/drivers/pci/pci-acpi.c
>> > +++ b/drivers/pci/pci-acpi.c
>> > @@ -382,7 +382,7 @@ static int __init acpi_pci_init(void)
>> >
>> >  	if (acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_ASPM) {
>> >  		printk(KERN_INFO"ACPI FADT declares the system doesn't support PCIe ASPM, so disable it\n");
>> > -		pcie_no_aspm();
>> > +//		pcie_no_aspm();
>> >  	}
>> >
>> >  	ret = register_acpi_bus_type(&acpi_pci_bus);
>>
>> Hi Pavel,
>>
>> Interesting. Can you collect dmesg and "lspci -vvv" output for both
>> cases (high ping latency and normal ping latency)?
>
> Will do. Results are in attachment (200KB...)
>
>> Also, how much
>> difference does this make in ping latency?
>
> The ping latency goes from 100msec range to <2msec.
>
>> If ASPM is enabled for a
>> device, e.g., your NIC, the link may be put in a low power state when
>> the device is idle. It takes time to exit that low power state, of
>> course, but I would expect that time to be in the microsecond range and
>> probably not observable via ping.
>
> I'd hope so. 100msec ping makes ssh unpleasant to use.

Pavel's ThinkPad X60 has two NICs: Intel 82573L and Intel PRO/Wireless
3945ABG. I'm pretty sure the problem he's reporting is with the
82573L. Ping times are bad (~100msec) when lspci reports ASPM as
enabled.

On Pavel's system, the FADT says we shouldn't enable OSPM control of
ASPM (ACPI_FADT_NO_ASPM is set), so we set "aspm_disabled = 1". One
effect is that we don't blacklist the pre-1.1 82573L device, which I
think results in it being left with the BIOS configuration, which
apparently has ASPM enabled. (Pavel, could you confirm the BIOS
config, e.g., with "pci=earlydump"?)

e1000e claims to disable ASPM, but because aspm_disabled is set, the
driver's call to pci_disable_link_state_locked() actually does nothing
[1].
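
To make the no-op concrete, here is a minimal user-space model of that
early return. It is a sketch, not the kernel source:
model_aspm_disabled, struct model_link and model_disable_link_state()
are hypothetical stand-ins for aspm_disabled, the link's state and
pci_disable_link_state_locked(). The only thing taken from the PCIe
spec is the 2-bit ASPM Control field in Link Control (bit 0 = L0s,
bit 1 = L1).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ASPM Control field of the PCIe Link Control register (bits 1:0). */
#define MODEL_ASPM_L0S  0x0001
#define MODEL_ASPM_L1   0x0002
#define MODEL_ASPM_MASK 0x0003

/* Hypothetical stand-in for one link's Link Control value. */
struct model_link {
	uint16_t lnkctl;
};

/* Hypothetical stand-in for the core's aspm_disabled flag. */
static bool model_aspm_disabled;

/*
 * Model of a "disable these ASPM states" request from a driver.
 * Returns true if the request was honored, false if it was ignored
 * because the OS was told not to control ASPM.
 */
static bool model_disable_link_state(struct model_link *link, uint16_t state)
{
	assert(link != NULL);

	/* The case discussed above: the request silently does nothing. */
	if (model_aspm_disabled)
		return false;

	/* Otherwise clear only the requested ASPM bits. */
	link->lnkctl &= (uint16_t)~(state & MODEL_ASPM_MASK);
	return true;
}
```

With model_aspm_disabled set, a link that starts with lnkctl = 0x3
still has 0x3 after the call, which is consistent with lspci still
reporting ASPM enabled after e1000e loads.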

I experimented [2] with Windows and found that when a driver requests
PciASPMOptOut, Windows will not touch ASPM config if the _OSC method
fails, i.e., the BIOS declines to grant ASPM control to the OS.
However, I do not know if Windows similarly ignores PciASPMOptOut when
the FADT ACPI_FADT_NO_ASPM bit is set.

The PCI core has failed spectacularly at providing useful ASPM
interfaces. Do you Intel folks have any suggestions about how to
resolve this? I assume that the Windows driver for the 82573L must
disable ASPM somehow, even though ACPI_FADT_NO_ASPM is set. Does it
just use brute-force, as in the version of __e1000e_disable_aspm()
that's used when CONFIG_PCIEASPM is not set?
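
For anybody not familiar with it, the brute-force approach amounts to
clearing the ASPM Control bits in Link Control on the device and on
its upstream port, without asking the ASPM core. A minimal user-space
sketch, assuming that is all the driver needs to do; struct fake_dev,
fake_clear_lnkctl() and brute_force_disable_aspm() are made-up names
that model config space as a plain variable, not real kernel
interfaces:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ASPM Control field of the PCIe Link Control register (bits 1:0). */
#define FAKE_LNKCTL_ASPM_L0S 0x0001
#define FAKE_LNKCTL_ASPM_L1  0x0002
#define FAKE_LNKCTL_ASPMC    0x0003

/* Hypothetical device: just a Link Control value and an upstream port. */
struct fake_dev {
	uint16_t lnkctl;
	struct fake_dev *upstream;	/* NULL if there is no parent bridge */
};

/* Stand-in for a read-modify-write of Link Control in config space. */
static void fake_clear_lnkctl(struct fake_dev *dev, uint16_t bits)
{
	dev->lnkctl &= (uint16_t)~bits;
}

/*
 * Clear the requested ASPM states on both ends of the link, the
 * downstream component first and then the upstream port. Bits outside
 * the ASPM Control field are never touched.
 */
static void brute_force_disable_aspm(struct fake_dev *dev, uint16_t state)
{
	assert(dev != NULL);

	state &= FAKE_LNKCTL_ASPMC;
	fake_clear_lnkctl(dev, state);
	if (dev->upstream)
		fake_clear_lnkctl(dev->upstream, state);
}
```

Note that this ignores aspm_disabled entirely, which is exactly why it
would work on Pavel's system and also why it goes behind the back of
whatever the firmware asked for.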

Bjorn

[1] We just merged 2add0ec1, which adds a "can't disable ASPM; OS
doesn't have ASPM control" message in this case, but I don't think
Pavel's kernel has this change. It doesn't change the behavior
anyway.

[2] https://bugzilla.kernel.org/show_bug.cgi?id=57331