Re: [PATCH 1/2] PCI/portdrv: Add necessary delay for disabling hotplug events

From: Feng Tang
Date: Tue Feb 04 2025 - 21:46:52 EST


Hi Lukas,

Thanks for your review and suggestions!

On Tue, Feb 04, 2025 at 10:07:04AM +0100, Lukas Wunner wrote:
> On Tue, Feb 04, 2025 at 01:37:57PM +0800, Feng Tang wrote:
> > According to PCIe 6.1 spec, section 6.7.3.2, software need to wait at
> > least 1 second for the command-complete event, before resending the cmd
> > or sending a new cmd.
> >
> > Currently get_port_device_capability() sends slot control cmd to disable
> > PCIe hotplug interrupts without waiting for its completion and there was
> > real problem reported for the lack of waiting.
> >
> > Add the necessary wait to comply with PCIe spec. The waiting logic refers
> > existing pcie_poll_cmd().
> [...]
> > --- a/drivers/pci/pcie/portdrv.c
> > +++ b/drivers/pci/pcie/portdrv.c
> > @@ -230,8 +260,7 @@ static int get_port_device_capability(struct pci_dev *dev)
> > * Disable hot-plug interrupts in case they have been enabled
> > * by the BIOS and the hot-plug service driver is not loaded.
> > */
> > - pcie_capability_clear_word(dev, PCI_EXP_SLTCTL,
> > - PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE);
> > + pcie_disable_hp_interrupts_early(dev);
> > }
>
> The language of the code comment is a bit confusing in that it
> says the hot-plug driver may not be "loaded". This sounds like
> it could be modular. But it can't. It's always built-in.

I'm not sure what problem this piece of code try to avoid, maybe
something simliar to the irq storm isseu as mentioned in the 2/2 patch?
The code comments could be about the small time window between this
point and the loading of pciehp driver, which will request_irq and
enable hotplug interrupt again.

> So I think what is really meant here is that the driver may be
> *disabled* in the config, i.e. CONFIG_HOTPLUG_PCI_PCIE=n.
>
> Now if CONFIG_HOTPLUG_PCI_PCIE=n, you don't need to observe the
> Command Completed delay because the hotplug driver won't touch
> the Slot Control register afterwards. It's not compiled in.
>
> On the other hand if CONFIG_HOTPLUG_PCI_PCIE=y, you don't need
> to disable the CCIE and HPIE interrupt because the hotplug driver
> will handle them.
>
> So I think the proper solution here is to make the write to the
> Slot Control register conditional on CONFIG_HOTPLUG_PCI_PCIE,
> like this:
>
> if (dev->is_hotplug_bridge &&
> (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT ||
> - pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM) &&
> - (pcie_ports_native || host->native_pcie_hotplug)) {
> - services |= PCIE_PORT_SERVICE_HP;
> + pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM)) {
> + if (pcie_ports_native || host->native_pcie_hotplug)
> + services |= PCIE_PORT_SERVICE_HP;
>
> /*
> * Disable hot-plug interrupts in case they have been enabled
> * by the BIOS and the hot-plug service driver is not loaded.
> */
> - pcie_capability_clear_word(dev, PCI_EXP_SLTCTL,
> - PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE);
> + if (!IS_ENABLED(CONFIG_HOTPLUG_PCI_PCIE))
> + pcie_capability_clear_word(dev, PCI_EXP_SLTCTL,
> + PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE);

Yes, this could sovlve the original issue.

The problem I heard from BIOS developer is, they didn't expect to
receive 2 link control commands at almost the same time, which doesn't
comply to pcie spec, and normaly the handling of one command will take
some time, though not as long as 1 second.

Thanks,
Feng

> The above patch also makes sure the interrupts are quiesced if the
> platform didn't grant hotplug control to OSPM.
>
> Thanks,
>
> Lukas