Re: [PATCH] PCI/ASPM: Make SUNIX serial card acceptable latency unlimited

From: Chris Chiu
Date: Mon Oct 03 2022 - 02:59:54 EST


On Fri, Sep 30, 2022 at 11:18 PM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
>
> On Fri, Sep 30, 2022 at 05:10:50PM +0800, Chris Chiu wrote:
> > SUNIX serial card advertise L1 acceptable L0S exit latency to be
> > < 2us, L1 < 32us, but the link capability shows they're unlimited.
> >
> > It fails the latency check and prohibits the ASPM L1 from being
> > enabled. The L1 acceptable latency quirk fixes the issue.
>
> Hi Chris, help me understand what's going on here.
>
> The "Endpoint L1 Acceptable Latency" field in Device Capabilities is
> described like this (PCIe r6.0, sec 7.5.3.3):
>
> This field indicates the acceptable latency that an Endpoint can
> withstand due to the transition from L1 state to the L0 state. It is
> essentially an indirect measure of the Endpoint’s internal
> buffering.
>
> Power management software uses the reported L1 Acceptable Latency
> number to compare against the L1 Exit Latencies reported (see below)
> by all components comprising the data path from this Endpoint to the
> Root Complex Root Port to determine whether ASPM L1 entry can be
> used with no loss of performance.
>
> The "L1 Exit Latency" in Link Capabilities:
>
> This field indicates the L1 Exit Latency for the given PCI Express
> Link. The value reported indicates the length of time this Port
> requires to complete transition from ASPM L1 to L0.
>
> Apparently the SUNIX device advertises in Dev Cap that it can tolerate
> a maximum of 32us of L1 Exit Latency for the entire path from the
> SUNIX device to the Root Port, and in Link Cap that the SUNIX device
> itself may take more than 64us to exit L1.
>
> If that's accurate, then we should not enable L1 for that device
> because using L1 may cause buffer overflows, e.g., dropped characters.
>
> Per 03038d84ace7 ("PCI/ASPM: Make Intel DG2 L1 acceptable latency
> unlimited"), the existing users of aspm_l1_acceptable_latency() are
> graphics devices where I assume there would be little data coming from
> the device and buffering would not be an issue.
>
> It doesn't seem plausible to me that a serial device, where there is a
> continuous stream of incoming data, could tolerate an *unlimited* exit
> latency.
>
> I could certainly believe that Link Cap advertises "> 64us" of L1 Exit
> Latency when it really should advertise "< 32us" or something. But I
> don't know how we could be confident in the correct value without
> knowledge of the device design.
>
> Bjorn

Hi Bjorn,
Thanks for the clear explanation. I understand your concern and
I'll try to reach the vendor for their comment about the device
design. But the value "unlimited" for L1 exit latency with specified
L1 acceptable latency on a self-claimed "ASPM L1" capable device
really looks weird to me, I'd rather assume the 32us limit in DevCap
is actually for LinkCap (L1 exit latency), and the acceptable latency
is actually unlimited.

I'll try to ask SUNIX if they intentionally program the latency
this way and expect the device is unlikely to enter ASPM L1. Or they
just accidentally program it with the incorrect value. Will keep
updating here. Thanks

Chris

>
> > Signed-off-by: Chris Chiu <chris.chiu@xxxxxxxxxxxxx>
> > ---
> > drivers/pci/quirks.c | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > index 4944798e75b5..e1663e43846e 100644
> > --- a/drivers/pci/quirks.c
> > +++ b/drivers/pci/quirks.c
> > @@ -5955,4 +5955,5 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56b0, aspm_l1_acceptable_latency
> > DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56b1, aspm_l1_acceptable_latency);
> > DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56c0, aspm_l1_acceptable_latency);
> > DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56c1, aspm_l1_acceptable_latency);
> > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_SUNIX, PCI_DEVICE_ID_SUNIX_1999, aspm_l1_acceptable_latency);
> > #endif
> > --
> > 2.25.1
> >