Re: [PATCH v2 3/3] PCI/ASPM: Add LTR sysfs attributes

From: Kai-Heng Feng
Date: Mon Oct 25 2021 - 22:29:07 EST


On Tue, Oct 26, 2021 at 1:31 AM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
>
> On Mon, Oct 25, 2021 at 06:33:50PM +0800, Kai-Heng Feng wrote:
> > On Thu, Oct 21, 2021 at 11:09 PM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> > > On Thu, Oct 21, 2021 at 11:51:59AM +0800, Kai-Heng Feng wrote:
> > > > Sometimes BIOS may not be able to program ASPM and LTR settings, for
> > > > instance, the NVMe devices behind Intel VMD bridges. For this case, both
> > > > ASPM and LTR have to be enabled to have significant power saving.
> > > >
> > > > Since we still want POLICY_DEFAULT honor the default BIOS ASPM settings,
> > > > introduce LTR sysfs knobs so users can set max snoop and max nosnoop
> > > > latency manually or via udev rules.
> > >
> > > How is the user supposed to figure out what "max snoop" and "max
> > > nosnoop" values to program?
> >
> > Actually, the only way I know is to get the value from other OS.
>
> I don't see how this can be a workable solution. IIUC this is
> specifically related to ASPM L1.2. L1.2 depends on LTR (the max
> snoop/nosnoop values) and the TPOWER_ON and Common_Mode_Restore_Time
> values. PCIe r5.0, sec 5.5.4, says:
>
> Prior to setting either or both of the enable bits for L1.2, the
> values for TPOWER_ON, Common_Mode_Restore_Time, and, if the ASPM
> L1.2 Enable bit is to be Set, the LTR_L1.2_THRESHOLD (both Value and
> Scale fields) must be programmed.
>
> The TPOWER_ON and Common_Mode_Restore_Time fields must be programmed
> to the appropriate values based on the components and AC coupling
> capacitors used in the connection linking the two components. The
> determination of these values is design implementation specific.
>
> I do not think collecting values from some other OS is a reasonable
> way to set these. The values would apparently depend on the
> electrical design of the particular machine.

What if we fallback to the original approach and use the VMD driver to
enable ASPM and LTR values?
At least I think Intel should be able to provide correct values for their SoC.

>
> > > If we add this, I'm afraid we'll have people programming random things
> > > that seem to work but are not actually reliable.
> >
> > IMO users need to take full responsibility for own doings.
> > Also, it's already doable by using setpci...
>
> I don't think it currently does, but setpci should taint the kernel.
>
> If users want to write setpci scripts to fiddle with stuff, that's
> great, but that moves it outside the supportable realm. If we provide
> a sysfs interface to do this, then it becomes more of *our* problem to
> make it work correctly, and I don't think that's practical in this
> case.

OK.

>
> > If we don't want to add LTR sysfs, what other options do we have to
> > enable VMD ASPM and LTR by default since BIOS doesn't do it for us?
> > 1) Enable it in the PCI or VMD driver, however this approach violates
> > POLICY_DEFAULT.
> > 2) Use `setpci` directly in udev rules to enable VMD ASPM and LTR.
> >
> > I think 2) is bad, and since 1) isn't so good either, the approach in
> > this patch may be the best compromise.
>
> I do not know how to safely enable L1.2. It's likely that I'm just
> missing something, but I don't see enough information in PCI config
> space and the PCI Firmware interface (_DSM for Latency Tolerance
> Reporting) to enable L1.2. It's possible that a new firmware
> interface is required.

I was told by Intel that they didn't use _DSM to get LTR values, at
least not the VMD case.

>
> I don't think it's wise to enable L1.2 unless we have good confidence
> that we know how to do it correctly. It's hard enough to debug ASPM
> issues as it is.

So what other options do we have if we want to enable VMD ASPM while
keeping CONFIG_PCIEASPM_DEFAULT=y?
Right now we enabled the VMD ASPM/LTR bits in downstream kernel, but
other distro users may want to have upstream support for this.

Kai-Heng

>
> Bjorn