Re: [PATCH 3/4] PCI: qcom: Indicate broken L1ss exit during resume from system suspend
From: Manivannan Sadhasivam
Date: Sat Apr 18 2026 - 01:42:00 EST
On Fri, Apr 17, 2026 at 05:26:15PM -0500, Bjorn Helgaas wrote:
> On Fri, Apr 17, 2026 at 05:36:42PM +0530, Manivannan Sadhasivam wrote:
> > On Thu, Apr 16, 2026 at 02:20:00PM -0500, Bjorn Helgaas wrote:
> > > On Tue, Apr 14, 2026 at 09:29:41PM +0530, Manivannan Sadhasivam via B4 Relay wrote:
> > > > From: Manivannan Sadhasivam <manivannan.sadhasivam@xxxxxxxxxxxxxxxx>
> > > >
> > > > Qcom PCIe RCs can successfully exit from L1ss during OS runtime.
> > > > However, during system suspend, the Qcom PCIe RC driver may
> > > > remove all resource votes and turn off the PHY to maximize
> > > > power savings.
> > > >
> > > > Consequently, when the host is in system suspend with the link
> > > > in L1ss and the endpoint asserts CLKREQ#, the OS must first wake
> > > > up and the RC driver must restore the PHY and enable the refclk.
> > > > This recovery process causes the strict L1ss exit latency time
> > > > to be exceeded. (If the RC driver were to retain all votes
> > > > during suspend, L1ss exit would succeed without issue, but at
> > > > the expense of higher power consumption).
> > >
> > > I don't think the link can be in L1.x if the PHY is turned off,
> > > can it? I assume if the PHY is off, the link would be in L2 (if
> > > aux power is available) or L3.
> >
> > As per the spec, if the link is in L1.2, the entire analog circuitry
> > of the PHY can be powered off and that's what I meant here. The
> > LTSSM state would be preserved by the MAC layer, whose context is
> > always retained.
> >
> > The only problem is that CLKREQ# is routed to an Always-on-Domain
> > (AON) inside the SoC. So when the endpoint asserts CLKREQ#, AON
> > wakes up the SoC and later the PCIe controller driver turns ON the
> > PHY. But by that time, the L1ss exit latency would've elapsed,
> > causing LDn.
> >
> > > L2 and L3 both correspond to the downstream device being in D3cold
> > > (PCIe r7.0, sec 5.3.2), so I assume this is a reset as far as the
> > > device is concerned, and we need all the delays associated with
> > > reset and the D3cold -> D0 transition.
> > >
> > > > This latency violation leads to an L1ss exit timeout, followed
> > > > by a Link Down (LDn) condition during resume. This LDn can crash
> > > > the OS if the endpoint hosts the RootFS, and for other types of
> > > > devices, it may result in a full device reset/recovery.
> > >
> > > What does "L1SS exit timeout" mean in PCIe terms? Is there some
> > > event (Message, interrupt, etc) that is triggered by the timeout?
> >
> > By 'L1ss exit timeout' I meant the failure to move to L0 state post
> > L1.2 exit. During L1.2 exit, the endpoint expects the refclk and
> > common mode voltage to be restored within the negotiated time. Per
> > spec, r7.0, sec 5.5.3.3.1, Exit from L1.2:
> >
> > ```
> > Next state is L1.0 after waiting for TPOWER_ON
> >
> > * Common mode is permitted to be established passively during L1.0,
> > and actively during Recovery. In order to ensure common mode has
> > been established, the Downstream Port must maintain a timer, and the
> > Downstream Port must continue to send TS1 training sequences until a
> > minimum of TCOMMONMODE has elapsed since the Downstream Port has
> > started transmitting TS1 training sequences and has detected
> > electrical idle exit on any Lane of the configured Link.
> > ```
> >
> > So if this condition is not satisfied, then the link would move to
> > the LDn state and that's the only event triggered to the OS.
> >
> > > > So to ensure that the client drivers can properly handle this
> > > > scenario, let them know about this platform limitation by
> > > > setting the 'pci_host_bridge::broken_l1ss_resume' flag.
> > >
> > > I don't see how this means L1SS is broken. If the device is
> > > effectively reset, of course we can't go from L1.x to L0 because
> > > we didn't start from L1.x.
> >
> > From the OS perspective, the link would still be in L1ss and not
> > expected to move to L2/L3 during suspend/resume, since that
> > transition is controlled by the OS itself. But when the OS resumes,
> > the link would go to LDn state and it can only be brought back to
> > L0, after a complete reset.
>
> Thanks for the background. It would help a lot if I had more of a
> hardware background!
>
> Does L1.2 have to meet the advertised L1 Exit Latency? I assume maybe
> it does because I don't see an exception for L1.x or any exit
> latencies advertised in the L1 PM Substates Capability.
>
As per my understanding, 'L1 Exit Latency' covers only the ASPM L1 state, not
L1ss, because the 'L1 Exit Latency' field existed even before L1 PM Substates
were introduced in r3.1. So it doesn't cover the L1.2 exit latency.
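
For reference, here is a minimal userspace sketch of how the 'L1 Exit Latency'
field decodes. The mask has the same value as PCI_EXP_LNKCAP_L1EL in
include/uapi/linux/pci_regs.h; the helper name and the standalone form are only
for illustration and are not part of this series:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Link Capabilities bits 17:15, same value as PCI_EXP_LNKCAP_L1EL */
#define LNKCAP_L1EL_MASK	0x00038000u
#define LNKCAP_L1EL_SHIFT	15

/*
 * Hypothetical helper: decode the advertised upper bound of the ASPM L1
 * exit latency in microseconds. Returns false for encoding 111b
 * ("more than 64 us"), which has no upper bound.
 */
static bool l1_exit_latency_max_us(uint32_t lnkcap, uint32_t *max_us)
{
	uint32_t enc = (lnkcap & LNKCAP_L1EL_MASK) >> LNKCAP_L1EL_SHIFT;

	assert(max_us != NULL);

	if (enc == 7)
		return false;

	/* 000b: < 1 us, 001b: 1-2 us, ... 110b: 32-64 us */
	*max_us = 1u << enc;
	return true;
}
```

Nothing in this field accounts for the T_POWER_ON and TCOMMONMODE waits quoted
above, which is why I think it cannot describe the L1.2 exit.
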
> Regardless, I'd be kind of surprised if *any* system could meet an
> L1.2 exit latency from a system suspend situation where PHY power is
> removed. On ACPI systems, the OS doesn't know how to remove PHY
> power, so I don't think that situation can happen unless firmware
> is involved in the suspend.
>
Yes, you are right. Even systems that turn off the PHY completely should have
some mechanism to detect the CLKREQ# assertion and turn ON the PHY within the
expected time. On our Qcom platforms, we do have co-processors that handle this
even before the OS wakes up. But support for those co-processors is currently
not available upstream, and we don't know when it will be added. Until then,
our only option is to not put the link into L1ss during suspend and to keep the
devices in D3cold to achieve the SoC low power state.
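
To illustrate the decision a client driver would make with the flag, here is a
rough userspace sketch. Only 'broken_l1ss_resume' comes from this series; the
struct, enum and function names are made up for illustration and do not exist
in the kernel:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical targets a client driver could pick for system suspend */
enum suspend_link_target {
	SUSPEND_KEEP_L1SS,	/* leave the link in L1ss across suspend */
	SUSPEND_D3COLD,		/* power the device down to D3cold */
};

/* Hypothetical stand-in for the one pci_host_bridge field used here */
struct host_bridge_model {
	bool broken_l1ss_resume;	/* flag proposed in this series */
};

/*
 * If the platform cannot exit L1ss in time on resume, avoid L1ss during
 * suspend and fall back to D3cold; otherwise keep the link in L1ss.
 */
static enum suspend_link_target
pick_suspend_target(const struct host_bridge_model *bridge)
{
	assert(bridge != NULL);

	if (bridge->broken_l1ss_resume)
		return SUSPEND_D3COLD;

	return SUSPEND_KEEP_L1SS;
}
```
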
> Maybe that's part of why pm_suspend_via_firmware() exists. What if
> native host drivers just called pm_set_suspend_via_firmware()? After
> all, if they support suspend, they're doing things that are done by
> firmware on other systems.
No, that would be inappropriate. pm_set_suspend_via_firmware() is supposed to
be called only when the firmware is invoked at the end of suspend. If the OS
handles everything without the firmware, there is no need to invoke this API.
- Mani
--
மணிவண்ணன் சதாசிவம்