Re: [PATCH v2] PCI: pciehp: Fix hotplug on Catlow Lake with unreliable PME status

From: Kuppuswamy Sathyanarayanan

Date: Tue Feb 17 2026 - 11:55:28 EST


Hi Rafael,

On 2/13/2026 3:14 PM, Kuppuswamy Sathyanarayanan wrote:
> On Intel Catlow Lake platforms, PCH PCIe root ports do not reliably
> update PME status registers (PME Status and PME Requester_ID in the
> Root Status register) during D3hot to D0 transitions, even though PME
> interrupts are delivered correctly.
>
> This issue manifests during PCIe hotplug operations as follows:
>
> 1. After a hot-remove event, the PCIe port transitions to D3hot and
> the Hot-Plug Interrupt Enable (HPIE) bit is cleared as the port
> enters the low-power state.
>
> 2. When a hot-add occurs while the port is in D3hot, a PME interrupt
> fires as expected to wake the port.
>
> 3. However, the PME interrupt handler finds the PME Status and
> PME Requester_ID fields of the Root Status register unpopulated,
> so it cannot identify which device triggered the PME. The handler
> returns IRQ_NONE, leaving the port in D3hot.
>
> 4. Because the port remains in D3hot with HPIE disabled, the hotplug
> driver ignores the hot-add event, resulting in the newly inserted
> device not being recognized.
>
> The PME interrupt delivery mechanism itself works correctly;
> interrupts arrive reliably. The problem is purely the missing status
> register updates. Verification via IOSF-SideBand (IOSF-SB) backdoor
> reads confirms that these registers remain empty when the PME
> interrupt fires. Neither BIOS nor kernel code is clearing these
> registers.
>
> This issue is present in all steppings of the Catlow Lake PCH and
> affects customers in production deployments. A public hardware
> erratum is not yet available.
>
> Work around this issue by disabling runtime PM for affected ports,
> keeping them in D0 during runtime operation. This ensures hotplug
> events are handled via direct interrupts rather than relying on
> unreliable PME-based wakeup.
>
> During system suspend/resume, PCIe ports are resumed unconditionally
> when coming out of system sleep due to DPM_FLAG_SMART_SUSPEND set by
> pcie_portdrv_probe(), and pciehp re-enables interrupts and checks slot
> occupancy during resume.
>
> The quirk is applied only to Catlow PCH PCIe root ports (device IDs
> 0x7a30 through 0x7a4b). Catlow CPU PCIe ports are not affected as
> they are not hotplug-capable.
>
> Suggested-by: Lukas Wunner <lukas@xxxxxxxxx>
> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx>
> ---

Could you please review this patch and let us know if calling
pm_runtime_disable() from a PCI quirk is acceptable?

The quirk keeps specific Catlow Lake PCH PCIe root ports in D0 to
work around a hardware bug where PME status registers are not reliably
updated during D3hot to D0 transitions, causing hotplug events to be
missed.

System suspend/resume is unaffected because DPM_FLAG_SMART_SUSPEND
ensures the ports are resumed unconditionally, and pciehp checks slot
occupancy on resume.
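
To make the failure sequence concrete, here is a small standalone model
of it. This is only a sketch and not kernel code: every name in it
(port_model, model_hot_remove() and so on) is made up for illustration,
and it assumes the simplified behaviour described in the commit log,
nothing more.

```c
#include <stdbool.h>

/* Hypothetical model of the sequence in the commit log; not kernel code. */
enum port_power { PORT_D0, PORT_D3HOT };

struct port_model {
	enum port_power power;
	bool hpie;               /* Hot-Plug Interrupt Enable */
	bool runtime_pm_enabled; /* false once the quirk has run */
	bool pme_status_latched; /* false models the Catlow Lake behaviour */
};

/* Hot-remove: with runtime PM enabled, the idle port drops to D3hot. */
static void model_hot_remove(struct port_model *p)
{
	if (p->runtime_pm_enabled) {
		p->power = PORT_D3HOT;
		p->hpie = false;
	}
}

/* Returns true if the PME handler can identify the source and wake the port. */
static bool model_pme_irq(struct port_model *p)
{
	if (!p->pme_status_latched)
		return false;	/* models IRQ_NONE: port stays in D3hot */
	p->power = PORT_D0;
	p->hpie = true;
	return true;
}

/* Hot-add: returns true if the hotplug driver ends up seeing the event. */
static bool model_hot_add(struct port_model *p)
{
	if (p->power == PORT_D3HOT)
		model_pme_irq(p);
	return p->power == PORT_D0 && p->hpie;
}
```

In this model, a port with runtime_pm_enabled set and pme_status_latched
clear misses the hot-add after a hot-remove, while the same port with
runtime_pm_enabled clear (the effect of the quirk) stays in D0 and sees
the event directly.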


>
> Changes since v1:
> * Removed hack in hotplug driver and disabled runtime PM on affected ports.
> * Fixed the commit log and comments accordingly.
>
> drivers/pci/quirks.c | 49 ++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 49 insertions(+)
>
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 280cd50d693b..779cd65b1a8a 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -6340,3 +6340,52 @@ static void pci_mask_replay_timer_timeout(struct pci_dev *pdev)
> DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_GLI, 0x9750, pci_mask_replay_timer_timeout);
> DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_GLI, 0x9755, pci_mask_replay_timer_timeout);
> #endif
> +
> +/*
> + * Intel Catlow Lake PCH PCIe root ports have a hardware issue where
> + * PME status registers (PME Status and PME Requester_ID in Root Status)
> + * are not reliably updated during D3hot to D0 transitions, even though
> + * PME interrupts are delivered correctly.
> + *
> + * When a hotplug event occurs while the port is in D3hot, the PME
> + * interrupt fires but the status registers remain empty. This prevents
> + * the PME handler from identifying the event source, leaving the port
> + * in D3hot and causing the hotplug driver to miss the event.
> + *
> + * Disable runtime PM to keep these ports in D0, ensuring hotplug events
> + * are handled via direct interrupts.
> + */
> +static void quirk_intel_catlow_pcie_no_pme_wakeup(struct pci_dev *dev)
> +{
> +	pm_runtime_disable(&dev->dev);
> +	pci_info(dev, "Catlow PCH port: PME status unreliable, disabling runtime PM\n");
> +}
> +/* Apply quirk to Catlow Lake PCH root ports (0x7a30 - 0x7a4b) */
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a30, quirk_intel_catlow_pcie_no_pme_wakeup);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a31, quirk_intel_catlow_pcie_no_pme_wakeup);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a32, quirk_intel_catlow_pcie_no_pme_wakeup);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a33, quirk_intel_catlow_pcie_no_pme_wakeup);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a34, quirk_intel_catlow_pcie_no_pme_wakeup);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a35, quirk_intel_catlow_pcie_no_pme_wakeup);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a36, quirk_intel_catlow_pcie_no_pme_wakeup);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a37, quirk_intel_catlow_pcie_no_pme_wakeup);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a38, quirk_intel_catlow_pcie_no_pme_wakeup);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a39, quirk_intel_catlow_pcie_no_pme_wakeup);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a3a, quirk_intel_catlow_pcie_no_pme_wakeup);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a3b, quirk_intel_catlow_pcie_no_pme_wakeup);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a3c, quirk_intel_catlow_pcie_no_pme_wakeup);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a3d, quirk_intel_catlow_pcie_no_pme_wakeup);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a3e, quirk_intel_catlow_pcie_no_pme_wakeup);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a3f, quirk_intel_catlow_pcie_no_pme_wakeup);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a40, quirk_intel_catlow_pcie_no_pme_wakeup);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a41, quirk_intel_catlow_pcie_no_pme_wakeup);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a42, quirk_intel_catlow_pcie_no_pme_wakeup);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a43, quirk_intel_catlow_pcie_no_pme_wakeup);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a44, quirk_intel_catlow_pcie_no_pme_wakeup);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a45, quirk_intel_catlow_pcie_no_pme_wakeup);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a46, quirk_intel_catlow_pcie_no_pme_wakeup);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a47, quirk_intel_catlow_pcie_no_pme_wakeup);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a48, quirk_intel_catlow_pcie_no_pme_wakeup);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a49, quirk_intel_catlow_pcie_no_pme_wakeup);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a4a, quirk_intel_catlow_pcie_no_pme_wakeup);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a4b, quirk_intel_catlow_pcie_no_pme_wakeup);
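
For reference, the table above covers one contiguous device ID range.
The standalone sketch below only illustrates which IDs match; the helper
and macro names are hypothetical and not part of the patch. If a more
compact form were preferred, a single fixup registered with PCI_ANY_ID
could presumably do a similar range check on dev->device, but that is an
assumption on my side and not something this patch does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the bounds are taken from the quirk table above. */
#define CATLOW_PCH_RP_FIRST	0x7a30
#define CATLOW_PCH_RP_LAST	0x7a4b

/* True if the device ID falls inside the range the quirk is applied to. */
static bool catlow_pch_root_port_id(uint16_t device)
{
	return device >= CATLOW_PCH_RP_FIRST && device <= CATLOW_PCH_RP_LAST;
}

/* Number of device IDs in the range, one per DECLARE_PCI_FIXUP_FINAL entry. */
static unsigned int catlow_pch_root_port_count(void)
{
	return CATLOW_PCH_RP_LAST - CATLOW_PCH_RP_FIRST + 1;
}
```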

--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer