Re: [PATCH v3 0/4] PCI: dwc: Rework the error handling of dw_pcie_wait_for_link() API

From: Niklas Cassel

Date: Fri Jan 09 2026 - 11:21:42 EST


Hello Mani,

On Wed, Jan 07, 2026 at 01:52:57PM +0100, Niklas Cassel wrote:
> On Mon, Jan 05, 2026 at 05:11:42PM +0530, Manivannan Sadhasivam wrote:
> > On Fri, Jan 02, 2026 at 01:01:02PM +0100, Niklas Cassel wrote:
> > > On Tue, Dec 30, 2025 at 08:37:31PM +0530, Manivannan Sadhasivam via B4 Relay wrote:
> >
> > What could be happening here is that since the endpoint is physically connected
> > to the bus, the receiver gets detected during Detect.Active state and LTSSM
> > enters the Polling state. I think the reason why it ended up staying in
> > Poll.Compliance could be due to (as per the spec):
> >
> > a. Not all Lanes from the predetermined set of Lanes from above have
> > detected an exit from Electrical Idle since entering Polling.Active.
> >
> > b. Any Lane that detected a Receiver during Detect received eight consecutive
> > TS1 Ordered Sets (or their complement) with the Lane and Link numbers set to
> > PAD, the Compliance Receive bit (bit 4 of Symbol 5) is 1b, and the Loopback bit
> > (bit 2 of Symbol 5) is 0b that the Compliance Receive bit (bit 4 of Symbol 5) is
> > set.
> >
> > So this is perfectly legal from endpoint perspective.
> >
> > > Perhaps a Kconfig or module param? Suggestions?
> > >
> >
> > There is a DIRECT_POLCOMP_TO_DETECT bit (bit 9) in DBI SD_CONTROL2 register.
> > This bit will ensure that the LTSSM will not stuck in Poll.Compliance and will
> > return back to Detect state. Could you set it on the EP before starting LTSSM
> > and see if it helps?
>
> I will test and get back to you.

Looking at the databook, it appears that the SD_CONTROL2 register only exists
if CX_RAS_DES_ENABLE, and the register is located in RAS_DES capability.

RK3588 implements the RAS_DES capability, so it can set that bit, but most
likely there are some platforms that do not.


Anyway, I tried the following patch:

diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
index c30a2ed324cd..73d3d4bc1886 100644
--- a/drivers/pci/controller/dwc/pcie-designware-host.c
+++ b/drivers/pci/controller/dwc/pcie-designware-host.c
@@ -584,6 +584,7 @@ int dw_pcie_host_init(struct dw_pcie_rp *pp)
struct device_node *np = dev->of_node;
struct pci_host_bridge *bridge;
int ret;
+ int ras_cap;

raw_spin_lock_init(&pp->lock);

@@ -670,6 +671,15 @@ int dw_pcie_host_init(struct dw_pcie_rp *pp)
if (ret)
goto err_remove_edma;

+#define SD_CONTROL2_REG 0xa4
+ ras_cap = dw_pcie_find_rasdes_capability(pci);
+ if (ras_cap) {
+ u32 val;
+ val = dw_pcie_readl_dbi(pci, ras_cap + SD_CONTROL2_REG);
+ val |= BIT(9);
+ dw_pcie_writel_dbi(pci, ras_cap + SD_CONTROL2_REG, val);
+ }
+
if (!dw_pcie_link_up(pci)) {
ret = dw_pcie_start_link(pci);
if (ret)


And now, every second or third boot, LTSSM is no longer in POLL_COMPLIANCE,
instead of every second or third boot, LTSSM is now always in:
[ 2.298107] rockchip-dw-pcie a40000000.pcie: Link failed to come up. LTSSM: POLL_ACTIVE


I did comment out goto err_stop_link if dw_pcie_wait_for_link(), so I can dump
LTSSM afterwards, when this happens:

[ 2.297916] rockchip-dw-pcie a40000000.pcie: Link failed to come up. LTSSM: POLL_ACTIVE

Then I do:

# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
DETECT_QUIET (0x00)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
DETECT_QUIET (0x00)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
DETECT_QUIET (0x00)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
DETECT_ACT (0x01)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
DETECT_QUIET (0x00)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
DETECT_QUIET (0x00)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
DETECT_QUIET (0x00)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
DETECT_QUIET (0x00)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
DETECT_QUIET (0x00)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
DETECT_QUIET (0x00)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
DETECT_QUIET (0x00)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
DETECT_QUIET (0x00)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
DETECT_QUIET (0x00)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
DETECT_QUIET (0x00)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
DETECT_ACT (0x01)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
DETECT_QUIET (0x00)

So it appears that after setting the DIRECT_POLCOMP_TO_DETECT bit,
instead of LTSSM being stuck in POLL_COMPLIANCE, LTSSM seems to
jump between DETECT_QUIET / DETECT_ACT / POLL_ACTIVE.


And just like before, as soon as I do the configfs writes on the EP board
to start the link:

# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
L0 (0x11)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
L0 (0x11)

LTSSM transitions out of compliance, and rescan finds my device.


So I don't think that setting the DIRECT_POLCOMP_TO_DETECT bit will
help us PCIe endpoint developers to continue with the workflow where we
can simply do a rescan on the host after starting the link training on
the EP.

Back to finding another alternative. Kconfig? module param? Suggestions?


Kind regards,
Niklas