Re: Race between MMIO writes and level IRQs

From: Mark Rutland
Date: Thu Jun 06 2019 - 06:09:31 EST

On Thu, Jun 06, 2019 at 11:53:05AM +0200, Marc Gonzalez wrote:
> Hello everyone,
> There's something about interrupts I have never quite understood,
> which I'd like to clear up once and for all. What I'm about to write
> will probably sound trivial to anyone's who's already figured it out,
> but I need to walk through it.
> Consider a device, living on some peripheral bus, with an interrupt
> line flowing from the device into some kind of interrupt controller.
> I.e. there are two "communication channels"
> 1) the peripheral bus, and 2) the "out-of-band" interrupt line.
> At some point, the device requires the CPU to do $SOMETHING. It sends
> a signal over the interrupt line (either a pulse for edge interrupts,
> or keeping the line high for level interrupts). After some time, the
> CPU will "take the interrupt", mask all(?) interrupts, and jump to the
> proper interrupt service routine (ISR).
> The CPU does whatever it's supposed to do, and then needs to inform
> the device that "yes, the work is done, stop pestering me". Typically,
> this is done by writing some value to one of the device's registers.
> AFAICT, this is the part where things can go wrong:
> The CPU issues the magic MMIO write, which will take some time to reach
> the device over the peripheral bus. Meanwhile, the device maintains the
> IRQ signal (assuming a level interrupt). Once the CPU leaves the ISR, the
> framework will unmask IRQs. If the write has not yet reached the device,
> the CPU will be needlessly interrupted again.

Even if the write *has* reached the device, there can be a delay before
the device de-sassert the interrupt line, and there can be a subsequent
delay for each irqchip between the device and the CPU re-sampling the
line and propagating this to its output.

Thus it's always possible to take a spurious IRQ after an IRQ handler
returns, even if the device de-asserted its interrupt line immediately.

Linux drivers are expected to be resilient to spurious IRQs.

> Basically, there's a race between the MMIO write and the IRQ unmasking.
> We'd like to be able to guarantee that the MMIO write is complete before
> unmasking interrupts, right?

This really depends on what you're doing. In many cases it's fine for
the write to complete later, given cases like the above.

> Some people use memory barriers, but my understanding is that this is
> not sufficient. The memory barrier may guarantee that the MMIO write
> has left the CPU "domain", but not that it has reached the device.

IIUC, memory barriers cannot necessarily ensure that a write has reached
a device.

> So it looks like the only surefire way to guarantee that the MMIO write
> has reached the device is to read the value back from the device?

A read ensures that the device has accepted the prior write, but does
not necessarily guarantee that it has changed internal state or
triggered any externally visible effects.

What exactly you need to do will depend on what you're trying to achieve
and the behaviour of the specific device.