Re: PCI: MSI interrupts masked using prohibited method

From: Matthew Wilcox
Date: Thu Jul 17 2008 - 11:39:29 EST

On Thu, Jul 17, 2008 at 02:14:40PM +0100, David Vrabel wrote:
> Matthew Wilcox wrote:
> > David's clearly talking about devices which don't support mask bits.
> > For these devices, we call
> > msi_set_enable(entry->dev, !flag);
> >
> > which turns off the PCI_MSI_FLAGS_ENABLE bit. I'm not sure I see the
> > bit in the PCI spec that says devices are then allowed to ignore the
> > Interrupt Disable bit in the device control register, but I concede such
> > devices may be out there.
> From the PCI spec:
> "This bit disables the device/function from asserting INTx#. A value of
> 0 enables the assertion of its INTx# signal. A value of 1 disables the
> assertion of its INTx# signal. This bit?s state after RST# is 0. Refer
> to Section for control of MSI."
> So the interrupt disable bit is only for the line interrupts, and not MSI.

Yes it is, but we don't touch that bit at this time.

When we enable MSIs, we set the INTx disable bit (or at least do a write
to it ... as Krzysztof Halasa pointed out, not all devices implement
this bit). When we mask MSIs, we clear the MSI enable bit. So a device
conforming to PCI 2.3 has both INTx and MSI disabled. Unfortunately, a
device conforming to PCI 2.2 has MSI disabled and INTx implicitly
re-enabled. Oops.

> > We have some infrastructure for resending IRQs, so we could soft-mask an
> > MSI and resend it when the irq is unmasked, if it has triggered.
> Is this really necessary? Any PCI device with MSI that doesn't have the
> hardware MSI mask bits must have some sort of device specific
> handshaking for managing when MSIs can be generated.

Maybe so, but we don't control that. Here's the flow that leads to
the problem you've observed (note: only x86-64 analysed here, other
architectures may vary):

The mask_msi_irq() function is the heart of what's going on. This is
set to be the ->mask() function pointer in the MSI irq_chip struct.

The device generates an MSI interrupt.
Some magic happens in assembly, and the processor ends up in do_IRQ()
which calls generic_handle_irq(). Because it's an MSI IRQ,
handle_edge_irq() is called instead of __do_IRQ().

There are two opportunities for calling the ->mask() here, one is:

if (unlikely((desc->status & (IRQ_INPROGRESS | IRQ_DISABLED)) ||
!desc->action)) {
desc->status |= (IRQ_PENDING | IRQ_MASKED);
mask_ack_irq(desc, irq);

The other is:

if (unlikely(!action)) {

This is all in generic code which knows nothing about any device-specific
method of masking interrupts from the chip.

> Regardless, doesn't __do_IRQ() handle this already anyway?

This is a good thought, let's follow it through. What if we simply make
->mask a no-op for devices which don't support mask bits?

MSIs are 'edge triggered' interrupts. They're sent once and then
forgotten by the hardware (as opposed to level triggered which will
continue to be triggered until deasserted).

The first time through (assuming ->action is non-NULL ...), we won't
try to mask the irq. The second time through, IRQ_INPROGRESS will be set,
so we try to mask_ack_irq().

How about we simply don't ack the irq at this point? That should prevent
it being triggered again, right?

Working on a patch to do this now ...

Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at