Re: Disabling an interrupt in the handler locks the system up

From: Mason
Date: Sat Oct 22 2016 - 19:10:45 EST


On 22/10/2016 13:37, Marc Zyngier wrote:

> Mason wrote:
>
>> In my mental picture of interrupts (which is obviously so
>> incomplete as to be wrong) interrupts are a way for hardware
>> to tell the CPU that they urgently need the CPU's attention.
>
> That's how the CPU interprets it, but this is even more basic than
> that, see below.
>
>> Obviously, the hardware being idle (line high) is not an urgent
>> matter which interests the CPU. Likewise, I'm not sure the CPU
>> cares that the hardware is busy (line low). It seems to me the
>> interesting event from the CPU's perspective is when the
>> hardware completes a "task" (transition from low to high).
>
> There is no such thing as "busy" when it comes to interrupts. An
> interrupt signals the CPU that some device-specific condition has been
> satisfied. It could be "I've received a packet" or "Battery is about to
> explode", depending on whether the device is a network controller or a
> temperature sensor. The interrupt doesn't describe the process that
> leads to that condition (packet being received or temperature rising),
> but the condition itself.
>
> In your case, as the device seems to do some form of processing
> (you're talking about task completion), then the interrupt seems to
> describe exactly this ("I'm done").

The device is a graphics engine, which can be programmed to perform
some operation on one or several frame buffers stored in memory.
It outputs its state (idle vs busy) on interrupt line 23.

>> So I had originally configured the interrupt as IRQ_TYPE_EDGE_RISING.
>> (There is an edge detection block in the irqchip, but the HW designer
>> warned me that at low frequencies, it is possible to "miss" some edges,
>> and we should prefer level triggers if possible.)
>
> Level and edge are not interchangeable. They do describe very different
> things:
>
> - Level indicates a persistent state, which implies that the device
> needs to be serviced so that this condition can be cleared (the UART
> has received a character, and won't be able to receive another until
> it has been read by the CPU). Once the device has been serviced and
> that condition cleared, it will lower its interrupt line.

With this graphics engine, there is nothing the CPU can do to
change what the engine outputs on the interrupt line:

When the graphics engine is idle, the line remains high, forever.
When the graphics engine is busy, the line remains low, until
all operations have been performed (engine idle).

All the CPU can do is mask the interrupt line at the interrupt
controller, as far as I understand.
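
To make that concrete, here is a toy user-space model (not kernel code) of
what I think happens if line 23 is treated as active-high, level-triggered.
The struct irq_line and run_level_high() names are made up for illustration,
and the "one handler entry per cycle" loop is an assumption about how the
CPU gets starved. It says nothing about why disable_irq itself locks up.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: one interrupt line plus its mask bit at the controller. */
struct irq_line {
    bool level;   /* what the device drives: true = high (engine idle) */
    bool masked;  /* mask bit in the interrupt controller */
};

/*
 * Step a pretend CPU for max_cycles cycles with the line configured as
 * active-high, level-triggered. The handler has no way to change
 * line->level (the engine offers nothing to acknowledge); the only thing
 * it may do is mask the line, and only if mask_in_handler is set.
 * Returns how many times the handler was entered.
 */
static int run_level_high(struct irq_line *line, int max_cycles,
                          bool mask_in_handler)
{
    int entries = 0;

    assert(line && max_cycles >= 0);
    for (int cycle = 0; cycle < max_cycles; cycle++) {
        bool asserted = line->level && !line->masked;

        if (!asserted)
            continue;            /* normal code gets to run this cycle */
        entries++;               /* handler runs instead */
        if (mask_in_handler)
            line->masked = true; /* the one lever the CPU has */
    }
    return entries;
}
```

In this model an idle engine with the line left unmasked takes the handler
on every single cycle, while masking inside the handler takes it exactly
once.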

> - Edge is indicative of an event having occurred ("I'm done") that
> doesn't require any action from the CPU. Because the device can
> continue its life without being poked by the CPU, it can continue
> delivering interrupts even if the first one hasn't been serviced.
> Being edge triggered, the signals get coalesced into a single
> interrupt. For example, the temperature sensor will say "Temperature
> rising" multiple times before the battery explodes, and it is the
> CPU's job to go and read the sensor to find out by how much it has
> risen.
>
> If your device only sends a pulse, then it is edge triggered, and it
> should be treated as such, no matter what your HW guy is saying. This
> usually involves looking at the device to find out how many times the
> interrupt has been generated (assuming the device is some kind of
> processing element). Of course, this is racy (interrupts can still be
> generated whilst you're processing them), and you should design your
> interrupt handler to take care of the possible race.
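
If I read the paragraph above correctly, the handler pattern would look
roughly like the sketch below. It is a user-space toy, not a driver: the
completion counter, struct device, struct edge_irq and both functions are
hypothetical, and I am assuming the device exposes some count the CPU can
read.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device: counts finished jobs and pulses its line on each. */
struct device {
    unsigned completed;  /* total jobs finished, readable by the CPU */
};

/* Hypothetical controller state for one edge-triggered line. */
struct edge_irq {
    bool pending;        /* a single bit, so several edges collapse into one */
};

/* Device side: finish one job and raise an edge. */
static void device_finish_job(struct device *dev, struct edge_irq *irq)
{
    assert(dev && irq);
    dev->completed++;
    irq->pending = true; /* already set? then this edge is coalesced */
}

/*
 * CPU side: acknowledge first, then read the device. A job that finishes
 * after the ack sets pending again, so the handler is entered once more
 * later; a job that finishes before the read is simply included in this
 * count. *seen carries the last count across calls.
 * Returns the number of jobs newly accounted for.
 */
static unsigned handle_irq(struct device *dev, struct edge_irq *irq,
                           unsigned *seen)
{
    assert(dev && irq && seen);
    irq->pending = false;

    unsigned now = dev->completed;
    unsigned fresh = now - *seen;

    *seen = now;
    return fresh;
}
```

Acknowledging before reading is the design choice that matters here: a
spurious extra entry that finds zero new jobs is harmless, whereas
acknowledging after the read could drop an edge that arrived in between.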

It is clear that the block does not send a pulse on the
interrupt line.

For reasons I don't understand, Linux didn't hang when I set
the IRQ type to IRQ_TYPE_EDGE_RISING, so it seemed better
than locking up the system.

I'm also fuzzy on what purpose the edge detector is supposed
to serve... I had the impression it was supposed to "capture"
an edge, to turn it into a level?
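
Here is my mental model of such a block, written as a toy in C. This is a
guess, not a description of the actual irqchip: I am assuming the detector
samples the line on some clock and latches a rising edge until software
acknowledges it. All names (struct edge_detector, detector_sample,
detector_ack) are made up.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sampled rising-edge detector with a latched output. */
struct edge_detector {
    bool prev;     /* input level seen at the previous sample */
    bool latched;  /* output: stays set until acknowledged */
};

/* Called once per sample clock with the current input level. */
static void detector_sample(struct edge_detector *d, bool input)
{
    assert(d);
    if (input && !d->prev)
        d->latched = true;  /* low at last sample, high now: rising edge */
    d->prev = input;
}

/* Software acknowledge: clear the latch, leave the sampled level alone. */
static void detector_ack(struct edge_detector *d)
{
    assert(d);
    d->latched = false;
}
```

Two things fall out of this model. Once acknowledged, the latch stays clear
while the line sits high, which would be consistent with the system not
hanging under IRQ_TYPE_EDGE_RISING. And if the line goes low and back high
between two samples, the detector never sees the low level, so that edge is
lost, which might be what the HW designer meant by "missing" edges.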

> So, to make it short: find out how your device works, and configure
> your interrupt controller in a similar way. Write your device driver
> with the interrupt policy in mind (state vs event). Keep it simple.

Thomas said "We describe the level which is raising the interrupt".
But I'm not sure I want the state "engine is busy" to raise an
interrupt. "engine is idle" makes more sense. But you said it's
stupid to set IRQ_TYPE_LEVEL_HIGH... /me confused

Maybe the fact that disable_irq locks the system up is an orthogonal
issue that needs to be fixed anyway.

Regards.