Re: [PATCH 1/1] PCI: dwc: Use regular interrupt instead of chained
From: Bjorn Helgaas
Date: Thu Jun 29 2023 - 16:59:17 EST
On Thu, Jun 29, 2023 at 04:42:07PM -0400, Radu Rendec wrote:
> On Thu, 2023-06-29 at 14:57 -0500, Bjorn Helgaas wrote:
> > On Thu, Jun 29, 2023 at 02:30:19PM -0400, Radu Rendec wrote:
> > > The DesignWare PCIe host driver uses a chained interrupt to demultiplex
> > > the downstream MSI interrupts. On Qualcomm SA8540P Ride, enabling both
> > > pcie2a and pcie3a at the same time can create an interrupt storm where
> > > the parent interrupt fires continuously, even though reading the PCIe
> > > host registers doesn't identify any child MSI interrupt source. This
> > > effectively locks up CPU0, which spends all the time servicing these
> > > interrupts.
> > >
> > > This is a clear example of how bypassing the interrupt core by using
> > > chained interrupts can be very dangerous if the hardware misbehaves.
> > >
> > > Convert the driver to use a regular interrupt for the demultiplex
> > > handler. This allows the interrupt storm detector to detect the faulty
> > > interrupt and disable it, allowing the system to run normally.
> >
> > There are many other users of irq_set_chained_handler_and_data() in
> > drivers/pci/controller/. Should they be similarly converted? If not,
> > how do we decide which need to use irq_set_chained_handler_and_data()
> > and which do not?
>
> According to Thomas Gleixner, yes. Obviously I don't want to put words
> in his mouth, but I think that's the gist of what he said in a reply to
> an RFC patch that I sent a few weeks ago:
> https://lore.kernel.org/all/877csohcll.ffs@tglx/
If it's a bug in pcie-designware-host.c, and it's also a bug in the
other callers, we should fix them all.
But you do imply that there's some DesignWare hardware issue involved,
too, so I guess it's possible the other drivers don't have an issue
and/or actually require the chained IRQs. That's why I asked how we
should decide.
> > > -static void dw_pcie_free_msi(struct dw_pcie_rp *pp)
> > > +static void __dw_pcie_free_msi(struct dw_pcie_rp *pp, u32 num_ctrls)
> > > {
> > > u32 ctrl;
> > >
> > > - for (ctrl = 0; ctrl < MAX_MSI_CTRLS; ctrl++) {
> > > + for (ctrl = 0; ctrl < num_ctrls; ctrl++) {
> > > if (pp->msi_irq[ctrl] > 0)
> > > - irq_set_chained_handler_and_data(pp->msi_irq[ctrl],
> > > - NULL, NULL);
> > > + free_irq(pp->msi_irq[ctrl], pp);
> > > }
> > >
> > > irq_domain_remove(pp->msi_domain);
> > > irq_domain_remove(pp->irq_domain);
> > > }
> > >
> > > +#define dw_pcie_free_msi(pp) __dw_pcie_free_msi(pp, MAX_MSI_CTRLS)
> >
> > What is the benefit of the dw_pcie_free_msi() macro?
>
> It allows me to add the num_ctrls parameter to the corresponding
> function (now renamed to __dw_pcie_free_msi()) without forcing all the
> existing call sites to send MAX_MSI_CTRLS explicitly.
>
> I needed that extra parameter to avoid duplicating the teardown code
> on the (new) error path in dw_pcie_msi_init() - see below.

>
> > > static void dw_pcie_msi_init(struct dw_pcie_rp *pp)
> > > {
> > > struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
> > > @@ -361,9 +353,16 @@ static int dw_pcie_msi_host_init(struct dw_pcie_rp *pp)
> > > return ret;
> > >
> > > for (ctrl = 0; ctrl < num_ctrls; ctrl++) {
> > > - if (pp->msi_irq[ctrl] > 0)
> > > - irq_set_chained_handler_and_data(pp->msi_irq[ctrl],
> > > - dw_chained_msi_isr, pp);
> > > + if (pp->msi_irq[ctrl] > 0) {
> > > + ret = request_irq(pp->msi_irq[ctrl], dw_pcie_msi_isr, 0,
> > > + dev_name(dev), pp);
> > > + if (ret) {
> > > + dev_err(dev, "Failed to request irq %d: %d\n",
> > > + pp->msi_irq[ctrl], ret);
> > > + __dw_pcie_free_msi(pp, ctrl);
>
> This is where I'm using the extra parameter. If we fail to request an
> interrupt, we need to free all the other interrupts that we have
> requested so far, to leave everything in a clean state. But we can't
> use MAX_MSI_CTRLS with __dw_pcie_free_msi() and rely on the check there
> because there may be extra interrupts that we haven't requested *yet*
> and we would attempt to free them.
Makes sense, thanks.
Bjorn