Re: [PATCH] thunderbolt: Fix PCIe device enumeration with delayed rescan
From: Mario Limonciello
Date: Wed Feb 18 2026 - 14:22:16 EST
On 2/12/26 6:34 PM, Mario Limonciello wrote:
++
On 2/12/2026 1:34 AM, Mika Westerberg wrote:
[+Cc AMD folks]
On Thu, Feb 12, 2026 at 08:07:54AM +0100, Mika Westerberg wrote:
On Thu, Feb 12, 2026 at 12:16:03PM +0800, AceLan Kao wrote:
Got troubles with mutt, my reply got rejected :(
Using gmail and copy/paste the content below again.

Disable runpm on 62:02.0, then we have 83:00.0 and its downstream port
84:01.0 and 85:00.0, and then the tbt storage is recognized.
Okay.
Okay, that means there is nothing wrong with the PCIe tunnel itself; it's
just that the PCIe side either does not get the PME or does not see that
the PCIe link becomes active (e.g. the PCIe Downstream Port runtime
suspends itself before the link status changes).

PME works so that there is a wake first: on Intel it's a GPE that wakes up
the root port, then the PCIe stack wakes up the devices, and then the PME
message is sent to the root complex.

If you do this on an Intel host, do you see the same?

The Intel host exhibits another symptom; I think the root cause is
different.

Plug in the dock, and then plug the tbt storages into the dock one by
one: both storages are detected.

Plug both tbt storages into the dock, and then plug the dock into the
machine: only one tbt storage appears. On rare occasions both tbt
storages show up, but most of the time only one is detected.

In this case, none of disabling runpm, rescanning, or lspci helps. So it's
most likely another issue.
By "detected" you mean the TB device is not detected on TB bus? Or it is
detected on TB bus but creating PCIe tunnel does not make the content
visible on PCIe bus?
You can check this from dmesg, the driver logs if it sees the plug event.
Or run tblist (from tbtools) and see if the device is listed.
I suspect the former, and in that case it is likely a PD/retimer related
issue rather than software. I see these once in a while, especially with
new hardware where the PD firmware is not yet stabilized. If there is a
TB/USB4 link then all is working from the TB/USB4 perspective.
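For anyone following along, the two checks above can be run like this (this assumes tbtools is installed from Intel's tbtools repository, as mentioned above; output naturally depends on the attached hardware):

```shell
# Check whether the thunderbolt driver logged the plug event at all:
dmesg | grep -i thunderbolt

# List the devices the TB/USB4 stack knows about (tblist is from tbtools):
tblist
```

If the device shows up in neither place, the problem is below the software stack (TB/USB4 link, PD/retimer); if it is listed but its content never appears on the PCIe bus, the tunnel/PCIe side is the suspect.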
Here you could try this:
# echo 250 > /sys/bus/pci/devices/0000:62:02.0/power/autosuspend_delay_ms

No luck, I enlarged the number to 1000, but still can't recognize the
second tbt storage.

Right, at that point the PCIe Downstream Port probably is already back
runtime suspended.

What about -1?

That's effectively the same as blocking runtime PM completely, so it
should work.
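For reference, runtime PM for a given port can also be blocked entirely from userspace via the standard sysfs knobs (the 0000:62:02.0 address is the Downstream Port from this report and will differ per system; requires root):

```shell
# Keep the tunneled PCIe Downstream Port at D0 by disabling runtime PM:
echo on > /sys/bus/pci/devices/0000:62:02.0/power/control

# Or leave runtime PM enabled but never autosuspend (the -1 case above):
echo auto > /sys/bus/pci/devices/0000:62:02.0/power/control
echo -1 > /sys/bus/pci/devices/0000:62:02.0/power/autosuspend_delay_ms

# Current runtime PM state of the port:
cat /sys/bus/pci/devices/0000:62:02.0/power/runtime_status
```

Useful for bisecting whether a detection failure is runtime-PM related before touching any code.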
I tried to wake up the PCIe ports in the beginning of tb_tunnel_pci() and
it works.
+	pdev = pci_get_domain_bus_and_slot(0, 0x62, PCI_DEVFN(0x02, 0));
+	if (pdev) {
+		if (pdev->dev.power.runtime_status == RPM_SUSPENDED)
+			pm_runtime_get_sync(&pdev->dev);
+		pci_dev_put(pdev);
+	}
But I can't find a generic way to get the bus and slot number, and
would you consider this a feasible approach?
No I don't want any (more) PCI related hacks in the driver.
This is not a TB issue, it's a PCIe issue. I suspect it has something to
do with handling PME/GPE on the AMD side. Essentially, when runtime PM is
blocked, the PCIe hotplug driver notices the tunnel just fine. When the
port is runtime suspended (e.g. in D3), it should send a PME to the root
complex that then brings the topology up so that the hotplug driver can
detect the presence, but this does not seem to happen.
If you enable dynamic debugging of pciehp, do you see anything happening
when you create the second PCIe tunnel? I suspect not.
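Concretely, pciehp dynamic debugging can be enabled with the standard dynamic debug interface (requires root, CONFIG_DYNAMIC_DEBUG, and a mounted debugfs):

```shell
# Turn on pciehp debug messages:
echo 'module pciehp +p' > /sys/kernel/debug/dynamic_debug/control

# Create the second PCIe tunnel, then watch for hotplug activity:
dmesg -w | grep -i pciehp

# Turn the messages off again when done:
echo 'module pciehp -p' > /sys/kernel/debug/dynamic_debug/control
```

If nothing from pciehp appears when the tunnel comes up, that supports the theory that the PME never reached the root port.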
From what I see above I agree this does seem like a PME delivery issue of some sort. Any chance you can put this on a PCIe analyzer and confirm whether the PME was ever sent?
We could reproduce the behavior on our side using a USB4 hub and two NVMe disks. Keeping the root port used for tunneling at D0 (turning off runtime PM) avoided it.
Cross-referencing an Intel system (Intel Core Ultra 200V), we saw that it couldn't reproduce the issue. But looking at the differences, we noticed the root port used for tunneling on that system stayed at D0 the entire time. Is that expected?