Re: [PATCH v5 04/10] pci: don't disable msi/msix at shutdown

From: Michael S. Tsirkin
Date: Sun Apr 12 2015 - 04:52:25 EST


On Fri, Apr 10, 2015 at 01:33:04PM -0500, Bjorn Helgaas wrote:
> Hi Michael,
>
> On Sun, Mar 29, 2015 at 05:04:11PM +0200, Michael S. Tsirkin wrote:
> > This partially reverts commit d52877c7b1afb8c37ebe17e2005040b79cb618b0:
> > "pci/irq: let pci_device_shutdown to call pci_msi_shutdown v2"
> >
> > It's un-necessary now that we disable msi at start, and it actually
> > turns out to cause problems: some device drivers don't register a level
> > interrupt handler when they detect msi/msix capability, switching off
> > msi while device is going causes device to assert a level interrupt
> > which is never de-asserted, causing a kernel hang.
> >
> > In particular, this was observed with virtio.
>
> I'm not questioning that this hang happens, but would you mind outlining
> *how* it happens in a little more detail? I'm not an IRQ expert, so I
> expected an "irq %d: nobody cared" message or something similar. It seems
> like a kernel hang is a pretty severe way to deal with an unexpected
> interrupt.

True. I intend to look into how this interacts with spurious
interrupt detection some more. Avoiding spurious interrupts
seems like a worthwhile goal in any case, right?

It seems clear how this causes hangs when noirqdebug is set: the level
interrupt is never handled or de-asserted, which later leads to
"softlockup detected" messages, or to a crash if softlockup_panic=1 is set.
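
To make the mechanism concrete, here is a rough user-space toy model of
it. This is only a sketch: none of the names below are kernel APIs, and
the window/limit numbers are my assumption, loosely modelled on the
spurious interrupt detector. A level line with no handler stays asserted
forever; with detection enabled the line is eventually masked, with
noirqdebug it just keeps firing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not kernel code. */
struct toy_irq_line {
	bool asserted;            /* level-triggered INTx line state */
	bool has_handler;         /* did the driver register an INTx handler? */
	bool masked;              /* set when the detector gives up on the line */
	unsigned long count;      /* interrupts seen in the current window */
	unsigned long unhandled;  /* of those, how many nobody handled */
};

/* Assumed thresholds, for illustration only. */
enum { TOY_WINDOW = 100000, TOY_UNHANDLED_LIMIT = 99900 };

/* Deliver one interrupt; returns true if a handler serviced it. */
static bool toy_deliver(struct toy_irq_line *line)
{
	if (line->has_handler) {
		line->asserted = false; /* handler acks the device, line drops */
		return true;
	}
	return false; /* nobody services the device: line stays asserted */
}

/*
 * Keep delivering while the line is asserted and not masked, up to
 * `budget` deliveries. Returns the number of deliveries made.
 * With noirqdebug set, the spurious-interrupt accounting is skipped.
 */
static unsigned long toy_run(struct toy_irq_line *line, bool noirqdebug,
			     unsigned long budget)
{
	unsigned long delivered = 0;

	assert(line != NULL);

	while (line->asserted && !line->masked && delivered < budget) {
		bool handled = toy_deliver(line);

		delivered++;
		if (noirqdebug)
			continue;

		line->count++;
		if (!handled)
			line->unhandled++;

		if (line->count >= TOY_WINDOW) {
			if (line->unhandled > TOY_UNHANDLED_LIMIT)
				line->masked = true; /* "nobody cared" */
			line->count = 0;
			line->unhandled = 0;
		}
	}
	return delivered;
}
```

In this model, a line with has_handler set goes quiet after one delivery.
Without a handler it is masked after one full window when detection is
on, and with noirqdebug it uses up whatever budget the caller gives it,
which is the toy equivalent of the CPU never getting out of the
interrupt path.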

> Is virtio the only way the hang could happen, or is it just coincidence
> that it was involved?

Well, you need a driver which doesn't handle level IRQs
when it enables MSI. virtio is one such driver.


> It'd be really nice if we could reference the bug report here. I think you
> said the original report was private. Can we open a kernel.org bugzilla
> that contains just the public information?

Ulrich Obergfell did most of the work on reproducing this, and
Fam Zheng did most of the debugging, so I'd like one of them
to open it, so that they get the appropriate credit.
Fam, Ulrich?

> > Cc: Yinghai Lu <yhlu.kernel.send@xxxxxxxxx>
> > Cc: Ulrich Obergfell <uobergfe@xxxxxxxxxx>
> > Cc: Rusty Russell <rusty@xxxxxxxxxxxxxxx>
> > Reported-by: Fam Zheng <famz@xxxxxxxxxx>
> > Signed-off-by: Michael S. Tsirkin <mst@xxxxxxxxxx>
> > ---
> > drivers/pci/pci-driver.c | 2 --
> > 1 file changed, 2 deletions(-)
> >
> > diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> > index 3cb2210..38a602c 100644
> > --- a/drivers/pci/pci-driver.c
> > +++ b/drivers/pci/pci-driver.c
> > @@ -450,8 +450,6 @@ static void pci_device_shutdown(struct device *dev)
> >
> >  	if (drv && drv->shutdown)
> >  		drv->shutdown(pci_dev);
> > -	pci_msi_shutdown(pci_dev);
> > -	pci_msix_shutdown(pci_dev);
> >  
> >  #ifdef CONFIG_KEXEC
> >  	/*
> > --
> > MST
> >