[PATCH 2/2] genirq: fasteoi resends interrupt on concurrent invoke

From: James Gowans
Date: Tue May 30 2023 - 17:41:01 EST


Update the generic handle_fasteoi_irq to cater for the case when the
next interrupt comes in while the previous handler is still running.
Currently when that happens the irq_may_run() early out causes the next
IRQ to be lost. Change the behaviour to mark the interrupt as pending
and re-send the interrupt when handle_fasteoi_irq sees that the pending
flag has been set. This is largely inspired by handle_edge_irq.

Generally it should not be possible for the next interrupt to arrive
while the previous handler is still running: the next interrupt should
only arrive after the EOI message has been sent and the previous handler
has returned. However, there is a race where if the interrupt affinity
is changed while the previous handler is running, then the next
interrupt can arrive at a different CPU while the previous handler is
still running. In that case there will be a concurrent invoke and the
early out will be taken.

For example:

CPU 0 | CPU 1
-----------------------------|-----------------------------
interrupt start |
handle_fasteoi_irq | set_affinity(CPU 1)
handler |
... | interrupt start
... | handle_fasteoi_irq -> early out
handle_fasteoi_irq return | interrupt end
interrupt end |

This issue was observed specifically on an arm64 system with a GIC-v3
handling MSIs; GIC-v3 uses the handle_fasteoi_irq handler. The issue is
that the global ITS is responsible for affinity but does not know
whether interrupts are pending/running, only the CPU-local redistributor
handles the EOI. Hence when the affinity is changed in the ITS, the new
CPU's redistributor does not know that the original CPU is still running
the handler.

Implementation notes:

It is believed that it's NOT necessary to mask the interrupt in
handle_fasteoi_irq() the way that handle_edge_irq() does. This is
because handle_edge_irq() caters for controllers which are too simple to
gate interrupts from the same source, so the kernel explicitly masks the
interrupt if it re-occurs [0].

[0] https://lore.kernel.org/all/bf94a380-fadd-8c38-cc51-4b54711d84b3@xxxxxxxxxx/

Suggested-by: Liao Chang <liaochang1@xxxxxxxxxx>
Signed-off-by: James Gowans <jgowans@xxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Marc Zyngier <maz@xxxxxxxxxx>
Cc: KarimAllah Raslan <karahmed@xxxxxxxxxx>
Cc: Yipeng Zou <zouyipeng@xxxxxxxxxx>
Cc: Zhang Jianhua <chris.zjh@xxxxxxxxxx>
---
kernel/irq/chip.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index 49e7bc871fec..42f33e77c16b 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -692,8 +692,15 @@ void handle_fasteoi_irq(struct irq_desc *desc)

raw_spin_lock(&desc->lock);

- if (!irq_may_run(desc))
+ /*
+ * When an affinity change races with IRQ delivery, the next interrupt
+ * can arrive on the new CPU before the original CPU has completed
+ * handling the previous one. Mark it as pending and return EOI.
+ */
+ if (!irq_may_run(desc)) {
+ desc->istate |= IRQS_PENDING;
goto out;
+ }

desc->istate &= ~(IRQS_REPLAY | IRQS_WAITING);

@@ -715,6 +722,12 @@ void handle_fasteoi_irq(struct irq_desc *desc)

cond_unmask_eoi_irq(desc, chip);

+ /*
+ * When the race descibed above happens, this will resend the interrupt.
+ */
+ if (unlikely(desc->istate & IRQS_PENDING))
+ check_irq_resend(desc, false);
+
raw_spin_unlock(&desc->lock);
return;
out:
--
2.25.1