Re: [PATCH 2/2] PCI: Disable PCIE hotplug interrupts early when msi is disabled

From: Feng Tang
Date: Wed Feb 05 2025 - 01:32:21 EST


On Tue, Feb 04, 2025 at 10:14:10AM +0100, Lukas Wunner wrote:
> On Tue, Feb 04, 2025 at 01:37:58PM +0800, Feng Tang wrote:
> > There was an IRQ storm bug when testing the "pci=nomsi" case, and the root
> > cause is: 'nomsi' will disable MSI and let devices and root ports use
> > legacy INTx interrupts, and likely make several devices/ports share one
> > interrupt. In the failure case, BIOS doesn't disable the PCIe hotplug
> > interrupts, and actually asserts the command-complete interrupt.
> > As MSI is disabled, ACPI initialization code will not enumerate the root
> > port's PCIe hotplug capability, and the pciehp service driver won't be
> > enabled for the root port to handle that interrupt. Later on, when it is
> > shared and enabled by another device driver like NVMe or a NIC, the
> > "nobody cared" IRQ storm happens.
> >
> > So disable the PCIe hotplug CCIE/HPIE interrupts in the early boot phase
> > when MSI is not enabled.
>
> So I think this issue should go away if disabling the interrupt
> by portdrv is no longer conditional on
>
> (pcie_ports_native || host->native_pcie_hotplug)
>
> like I've just proposed here:
>
> https://lore.kernel.org/r/Z6HYuBDP6uvE1Sf4@xxxxxxxxx/
>
> ... in which case this patch won't be necessary. Can you confirm that?

Thanks for the suggestion! I will try to get access to the platform for
testing and report back.

As for the change,
+	if (!IS_ENABLED(CONFIG_HOTPLUG_PCI_PCIE))
+		pcie_capability_clear_word(dev, PCI_EXP_SLTCTL,
+			PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE);

CONFIG_HOTPLUG_PCI_PCIE is always enabled on our platform and in many
distros, so I guess the check needs to be removed. Without the check we
would see the 1 second wait again; would that still need the waiting
logic from patch 1/2?
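
For reference, the effect of the clear on the Slot Control value can be
sketched in user space as below. This is only an illustration: the two bit
values match include/uapi/linux/pci_regs.h, but sltctl_mask_hotplug_irqs()
is a hypothetical helper, it operates on a plain integer rather than
config space, and it does not model the command-completed wait discussed
above.

```c
#include <stdint.h>

/* Bit values as defined in include/uapi/linux/pci_regs.h */
#define PCI_EXP_SLTCTL_CCIE 0x0010 /* Command Completed Interrupt Enable */
#define PCI_EXP_SLTCTL_HPIE 0x0020 /* Hot-Plug Interrupt Enable */

/*
 * Hypothetical helper (not a kernel function): returns the Slot Control
 * value with both hotplug interrupt enables cleared, leaving every other
 * bit untouched. This mirrors the read-modify-write that
 * pcie_capability_clear_word() would perform on the real register.
 */
static inline uint16_t sltctl_mask_hotplug_irqs(uint16_t sltctl)
{
	return sltctl & (uint16_t)~(PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE);
}
```

Only CCIE and HPIE change; other enables, such as Presence Detect Changed
Enable (0x0008), keep whatever value the BIOS left in them.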

Thanks,
Feng

>
> You can split the change I've proposed into two patches if you like.
>
> Thanks,
>
> Lukas