Re: [PATCH 2/2] genirq: fasteoi resends interrupt on concurrent invoke

From: Marc Zyngier
Date: Wed May 31 2023 - 03:01:01 EST


On Tue, 30 May 2023 22:38:48 +0100,
James Gowans <jgowans@xxxxxxxxxx> wrote:
>
> Update the generic handle_fasteoi_irq to cater for the case when the
> next interrupt comes in while the previous handler is still running.
> Currently when that happens the irq_may_run() early out causes the next
> IRQ to be lost. Change the behaviour to mark the interrupt as pending
> and re-send the interrupt when handle_fasteoi_irq sees that the pending
> flag has been set. This is largely inspired by handle_edge_irq.
>
> Generally it should not be possible for the next interrupt to arrive
> while the previous handler is still running: the next interrupt should
> only arrive after the EOI message has been sent and the previous handler
> has returned.

There is no such message with LPIs. I pointed that out previously.

> However, there is a race where if the interrupt affinity
> is changed while the previous handler is running, then the next
> interrupt can arrive at a different CPU while the previous handler is
> still running. In that case there will be a concurrent invoke and the
> early out will be taken.
>
> For example:
>
> CPU 0 | CPU 1
> -----------------------------|-----------------------------
> interrupt start |
> handle_fasteoi_irq | set_affinity(CPU 1)
> handler |
> ... | interrupt start
> ... | handle_fasteoi_irq -> early out
> handle_fasteoi_irq return | interrupt end
> interrupt end |
>
> This issue was observed specifically on an arm64 system with a GIC-v3
> handling MSIs; GIC-v3 uses the handle_fasteoi_irq handler. The issue is
> that the global ITS is responsible for affinity but does not know
> whether interrupts are pending/running, only the CPU-local redistributor
> handles the EOI. Hence when the affinity is changed in the ITS, the new
> CPU's redistributor does not know that the original CPU is still running
> the handler.

Similar to your previous patch, you don't explain *why* the interrupt
gets delivered when it is an LPI, and not for any of the other GICv3
interrupt types. That's an important point.

>
> Implementation notes:
>
> It is believed that it's NOT necessary to mask the interrupt in
> handle_fasteoi_irq() the way that handle_edge_irq() does. This is
> because handle_edge_irq() caters for controllers which are too simple to
> gate interrupts from the same source, so the kernel explicitly masks the
> interrupt if it re-occurs [0].
>
> [0] https://lore.kernel.org/all/bf94a380-fadd-8c38-cc51-4b54711d84b3@xxxxxxxxxx/
>
> Suggested-by: Liao Chang <liaochang1@xxxxxxxxxx>
> Signed-off-by: James Gowans <jgowans@xxxxxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: Marc Zyngier <maz@xxxxxxxxxx>
> Cc: KarimAllah Raslan <karahmed@xxxxxxxxxx>
> Cc: Yipeng Zou <zouyipeng@xxxxxxxxxx>
> Cc: Zhang Jianhua <chris.zjh@xxxxxxxxxx>
> ---
> kernel/irq/chip.c | 15 ++++++++++++++-
> 1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
> index 49e7bc871fec..42f33e77c16b 100644
> --- a/kernel/irq/chip.c
> +++ b/kernel/irq/chip.c
> @@ -692,8 +692,15 @@ void handle_fasteoi_irq(struct irq_desc *desc)
>
> raw_spin_lock(&desc->lock);
>
> - if (!irq_may_run(desc))
> + /*
> + * When an affinity change races with IRQ delivery, the next interrupt
> + * can arrive on the new CPU before the original CPU has completed
> + * handling the previous one. Mark it as pending and return EOI.
> + */
> + if (!irq_may_run(desc)) {
> + desc->istate |= IRQS_PENDING;
> goto out;
> + }
>
> desc->istate &= ~(IRQS_REPLAY | IRQS_WAITING);
>
> @@ -715,6 +722,12 @@ void handle_fasteoi_irq(struct irq_desc *desc)
>
> cond_unmask_eoi_irq(desc, chip);
>
> + /*
> + * When the race descibed above happens, this will resend the interrupt.
> + */
> + if (unlikely(desc->istate & IRQS_PENDING))
> + check_irq_resend(desc, false);
> +
> raw_spin_unlock(&desc->lock);
> return;
> out:

While I'm glad that you eventually decided to use the resend mechanism
instead of spinning on the "old" CPU, I still think imposing this
behaviour on all users without any discrimination is wrong.

Look at what it does if an interrupt is a wake-up source. You'd
pointlessly requeue the interrupt (bonus points if the irqchip doesn't
provide a HW-based retrigger mechanism).

I still maintain that this change should only be applied for the
particular interrupts that *require* it, and not as a blanket change
affecting everything under the sun. I have proposed such a change in
the past, feel free to use it or roll your own.

In the meantime, I strongly oppose this change as proposed.

Thanks,

M.

--
Without deviation from the norm, progress is not possible.