Re: AW: [BUG] Thunderbolt runtime resume during PCIe removal causes IRQ warning and shutdown failure.

From: Bjorn Helgaas

Date: Thu Apr 02 2026 - 18:21:11 EST


[+cc Thunderbolt & pciehp folks, initial report of system poweroff
failure at
https://lore.kernel.org/all/AM9PR10MB42316BF3E59B29E1EA3E5600B756A@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx]

On Fri, Mar 27, 2026 at 05:28:28PM +0000, Georg Klima wrote:
> Hi Bjorn,
>
> Upstream without nvidia, more debug, same issue with aspm default:

Thanks for testing with an upstream kernel (6.19.10). The complete
dmesg log is attached to
https://lore.kernel.org/all/AM9PR10MB4231D6536D271E1F5A81F3D1B757A@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/

Linux only requests control of PME, AER, hotplug, etc., if Linux
supports ASPM and MSI. With "pcie_aspm=off", Linux doesn't support
ASPM, so it doesn't request control:

--- dmesg_aspm_off.txt
+++ dmesg_actual.txt
- acpi PNP0A08:01: _OSC: OS supports [ExtendedConfig Segments MSI EDR HPX-Type3]
+ acpi PNP0A08:01: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI EDR HPX-Type3]
- acpi PNP0A08:01: _OSC: not requesting OS control; OS requires [ExtendedConfig ASPM ClockPM MSI]
+ acpi PNP0A08:01: _OSC: OS now controls [PCIeHotplug SHPCHotplug PME AER PCIeCapability LTR DPC]
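
For anyone comparing their own logs, here is a rough sketch (not kernel
code) that pulls the bracketed feature lists out of the _OSC lines above
and reports which required features the OS did not advertise. The helper
names parse_osc() and missing_features() are made up for illustration,
and the regex only assumes the three message formats shown in this diff.

```python
import re

# Matches the three _OSC message formats quoted in the diff above and
# captures the space-separated feature list inside the brackets.
_OSC_LIST = re.compile(
    r"_OSC: (OS supports|OS now controls|"
    r"not requesting OS control; OS requires) \[([^\]]*)\]"
)


def parse_osc(line):
    """Return (message kind, set of features), or None if not an _OSC line."""
    m = _OSC_LIST.search(line)
    if m is None:
        return None
    return m.group(1), set(m.group(2).split())


def missing_features(supports_line, requires_line):
    """Features listed under 'OS requires' that 'OS supports' lacks."""
    _, supported = parse_osc(supports_line)
    _, required = parse_osc(requires_line)
    return required - supported


# Lines copied from dmesg_aspm_off.txt in the diff above.
supports = ("acpi PNP0A08:01: _OSC: OS supports "
            "[ExtendedConfig Segments MSI EDR HPX-Type3]")
requires = ("acpi PNP0A08:01: _OSC: not requesting OS control; "
            "OS requires [ExtendedConfig ASPM ClockPM MSI]")

print(sorted(missing_features(supports, requires)))
```

With the "pcie_aspm=off" lines this reports ASPM and ClockPM as the
missing features, which matches why control was not requested on that
boot.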

I suspect the issue is related to those services, not to ASPM itself.
Booting with "pcie_port_pm=off" might be a more targeted workaround.

You have this topology:

0000:80:1b.4: [8086:7f44] PCIe Root Port to [bus 88-d8]
0000:88:00.0: [8086:5780] PCIe Switch Upstream Port (JHL9580 Thunderbolt 5)
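
As a quick way to see which devices sit below that Root Port, here is a
minimal sketch that checks whether a PCI address falls inside the
[bus 88-d8] range. parse_bdf() and is_below() are hypothetical helpers,
and the check only compares domain and bus number against the range
shown above; it does not read any real bridge registers.

```python
def parse_bdf(addr):
    """Split 'dddd:bb:dd.f' into (domain, bus, device, function) integers."""
    domain, bus, devfn = addr.split(":")
    dev, fn = devfn.split(".")
    return int(domain, 16), int(bus, 16), int(dev, 16), int(fn, 16)


def is_below(addr, bridge_domain, secondary, subordinate):
    """True if addr's bus lies in the bridge's [secondary, subordinate] range."""
    domain, bus, _, _ = parse_bdf(addr)
    return domain == bridge_domain and secondary <= bus <= subordinate


# Bus range taken from "PCIe Root Port to [bus 88-d8]" above.
for addr in ("0000:88:00.0", "0000:8a:00.0", "0000:b1:00.0", "0000:80:1b.4"):
    print(addr, is_below(addr, 0x0000, 0x88, 0xD8))
```

88:00.0, 8a:00.0 and b1:00.0 all land inside the range, so all of them
are reached through 80:1b.4; the Root Port itself (bus 80) does not.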

and the first thing I see in the 6.19.10 log is this, which makes me
think we put the Thunderbolt controller at 88:00.0 into D3 and tried
to bring it back to D0, but that took too long, so we couldn't access
downstream devices like b1:00.0:

Mar 27 18:08:40 fedora kernel: pcieport 0000:80:1b.4: Data Link Layer Link Active not set in 100 msec
Mar 27 18:08:40 fedora kernel: pcieport 0000:80:1b.4: pciehp: Slot(25): Card not present
Mar 27 18:08:40 fedora kernel: xhci_hcd 0000:b1:00.0: Controller not ready at resume -19
Mar 27 18:08:40 fedora kernel: ------------[ cut here ]------------
Mar 27 18:08:40 fedora kernel: xhci_hcd 0000:b1:00.0: PCI post-resume error -19!
Mar 27 18:08:40 fedora kernel: thunderbolt 0000:8a:00.0: interrupt for TX ring 0 is already enabled
Mar 27 18:08:40 fedora kernel: tb_ring_start+0x149/0x330 [thunderbolt]
Mar 27 18:08:40 fedora kernel: tb_ctl_start+0x1b/0xc0 [thunderbolt]
Mar 27 18:08:40 fedora kernel: tb_domain_runtime_resume+0x19/0x40 [thunderbolt]
Mar 27 18:08:40 fedora kernel: __rpm_callback+0x48/0x1f0
Mar 27 18:08:40 fedora kernel: rpm_callback+0x6d/0x80
Mar 27 18:08:40 fedora kernel: rpm_resume+0x4ab/0x6d0