Re: [PATCH] irq: fasteoi handler re-runs on concurrent invoke

From: Gowans, James
Date: Thu May 25 2023 - 06:05:07 EST


On Tue, 2023-05-23 at 11:16 +0800, liaochang (A) wrote:
> Hi, James and Marc,
>
> After studying your discussions, I list some requirements need to satify for
> the final practical solution:
>
> 1. Use the GIC to maintain the unhandled LPI.
> 2. Do not change the semantics of set_irq_affinity, which means that the interrupt
> action must be performed on the new CPU when the next interrupt occurs after a
> successful set_irq_affinity operation.
> 3. Minimize the cost, especially to other tasks running on CPUs, which means avoid
> a do/while loop on the original CPU and repeatedly resend interrupt on the new CPU.

Seems reasonable to me.
>
> Based on these requirements and Linux v6.3 rev, I propose the following hack:
>
> diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
> index 49e7bc871fec..1b49518b19bd 100644
> --- a/kernel/irq/chip.c
> +++ b/kernel/irq/chip.c
> @@ -692,8 +692,14 @@ void handle_fasteoi_irq(struct irq_desc *desc)
>
> raw_spin_lock(&desc->lock);
>
> - if (!irq_may_run(desc))
> + /*
> + * Ack another interrupt from the same source can occurs on new
> + * CPU even before the first one is handled on original CPU.
> + */
> + if (!irq_may_run(desc)) {
> + desc->istate |= IRQS_PENDING;
> goto out;
> + }

Wording may need a slight tweak, especially pointing out why PENDING is
set.
>
> desc->istate &= ~(IRQS_REPLAY | IRQS_WAITING);
>
> @@ -715,6 +721,8 @@ void handle_fasteoi_irq(struct irq_desc *desc)
>
> cond_unmask_eoi_irq(desc, chip);
>
> + check_irq_resend(desc, true);
> +

I'm pretty sure the second param, "inject", must not be true. As is this
will *always* resend the interrupt after every interrupt. Infinite
interrupt loop! (did you test this?)
I've changed it to false and tested, seems good to me.

I'd suggest adding a comment here which pairs with the comment above
explaining why this is being called here. Also perhaps adding:
if (unlikely((desc->istate & IRQS_PENDING))
...so that in general this function call will be skipped over.

If you want I'm happy to make these tweaks and post as V2?
There are also some other comments I'm keen to add to the flag enums to
make it a bit clearer what some of the flags mean.

Marc, what are your thoughts on this approach?
The main (only?) downside I see compared to your suggestion is that in the
irq_sw_resend() case the tasklet will be scheduled on the original CPU. As
you mentioned in a previous reply, perhaps we shouldn't bother thinking
about the sw_resend case now as only GIC is impacted. In future if
necessary the sw_resend implementation could be tweaked to schedule the
tasklet on the CPU which the interrupt is affined to.

JG

> raw_spin_unlock(&desc->lock);
> return;
> out:
>
> Looking forward to your feedbacks, thanks.