Re: [PATCH] irq: fasteoi handler re-runs on concurrent invoke

From: liaochang (A)
Date: Mon May 22 2023 - 23:16:26 EST




On 2023/5/2 16:43, Gowans, James wrote:
> Hi Marc and Thomas,
>
> On Tue, 2023-04-18 at 12:56 +0200, James Gowans wrote:
>>> static inline irq_hw_number_t irqd_to_hwirq(struct irq_data *d)
>>> diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
>>> index 49e7bc871fec..73546ba8bc43 100644
>>> --- a/kernel/irq/chip.c
>>> +++ b/kernel/irq/chip.c
>>> @@ -692,8 +692,11 @@ void handle_fasteoi_irq(struct irq_desc *desc)
>>> raw_spin_lock(&desc->lock);
>>> - if (!irq_may_run(desc))
>>> + if (!irq_may_run(desc)) {
>>> + if (irqd_needs_resend_when_in_progress(&desc->irq_data))
>>> + check_irq_resend(desc, true);
>>> goto out;
>>> + }
>>
>>
>> This will run check_irq_resend() on the *newly affined* CPU, while the old
>> one is still running the original handler. AFAICT what will happen is:
>> check_irq_resend
>> try_retrigger
>> irq_chip_retrigger_hierarchy
>> its_irq_retrigger
>> ... which will cause the ITS to *immediately* re-trigger the IRQ. The
>> original CPU can still be running the handler in that case.
>>
>> If that happens, consider what will happen in check_irq_resend:
>> - first IRQ comes in, successfully runs try_retrigger and sets IRQS_REPLAY.
>> - it is *immediately* retriggered by ITS, and because the original handler
>> on the other CPU is still running, comes into check_irq_resend again.
>> - check_irq_resend now observes that IRQS_REPLAY is set and early outs.
>> - No more resends, the IRQ is still lost. :-(
>>
>> Now I admit the failure mode is getting a bit pathological: two re-
>> triggers while the original handler is still running, but I was able to
>> hit this on my test machine by intentionally slowing
>> the handler down by a few dozen micros. Should we cater for this?
>>
>> I can see two possibilities:
>> - tweak check_irq_resend() to not early-out in this case but to keep re-
>> triggering until it eventually runs.
>> - move the check_irq_resend to only happen later, *after* the original
>> handler has finished running. This would be very similar to what I
>> suggested in my original patch, except instead of running a do/while loop,
>> the code would observe that the pending flag was set again and run
>> check_irq_resend.

Hi, James and Marc,

After studying your discussion, I have listed some requirements that the final
practical solution needs to satisfy:

1. Use the GIC to maintain the unhandled LPI.
2. Do not change the semantics of set_irq_affinity, which means that the interrupt
action must be performed on the new CPU when the next interrupt occurs after a
successful set_irq_affinity operation.
3. Minimize the cost, especially to other tasks running on the CPUs, which means
avoiding both a do/while loop on the original CPU and repeated resending of the
interrupt on the new CPU.

Based on these requirements and Linux v6.3, I propose the following hack:

diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index 49e7bc871fec..1b49518b19bd 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -692,8 +692,14 @@ void handle_fasteoi_irq(struct irq_desc *desc)

raw_spin_lock(&desc->lock);

- if (!irq_may_run(desc))
+ /*
+ * Another interrupt from the same source can be acked on the new
+ * CPU even before the first one has been handled on the original CPU.
+ */
+ if (!irq_may_run(desc)) {
+ desc->istate |= IRQS_PENDING;
goto out;
+ }

desc->istate &= ~(IRQS_REPLAY | IRQS_WAITING);

@@ -715,6 +721,8 @@ void handle_fasteoi_irq(struct irq_desc *desc)

cond_unmask_eoi_irq(desc, chip);

+ check_irq_resend(desc, true);
+
raw_spin_unlock(&desc->lock);
return;
out:
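
To make the intended flow easier to follow, here is a minimal userspace sketch
of the idea (not kernel code). The model_* names and the struct are made up for
illustration only. It assumes a resend is requested only when an interrupt was
actually noted as pending while the handler was running; the real
check_irq_resend() has additional conditions that are not modelled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the IRQD_IRQ_INPROGRESS / IRQS_PENDING state. */
enum {
    MODEL_INPROGRESS = 1u << 0, /* handler is running on some CPU */
    MODEL_PENDING    = 1u << 1, /* an interrupt arrived while in progress */
};

struct model_desc {
    unsigned int state;
    int handled; /* number of times the handler ran to completion */
    int resent;  /* number of resend requests issued */
};

/* Entry half of the modelled flow handler: true if the handler may run. */
static bool model_irq_enter(struct model_desc *d)
{
    if (d->state & MODEL_INPROGRESS) {
        /* Bounced (e.g. on the new CPU): only note that it happened. */
        d->state |= MODEL_PENDING;
        return false;
    }
    d->state |= MODEL_INPROGRESS;
    return true;
}

/* Exit half: the handler has finished; resend once if something was noted. */
static void model_irq_exit(struct model_desc *d)
{
    assert(d->state & MODEL_INPROGRESS);
    d->handled++;
    d->state &= ~MODEL_INPROGRESS;
    if (d->state & MODEL_PENDING) {
        d->state &= ~MODEL_PENDING;
        d->resent++;
    }
}

/*
 * Scenario from this thread: `extra` interrupts arrive while the first one
 * is still being handled. Returns the number of resend requests, or -1 if
 * the first interrupt could not start its handler.
 */
static int model_scenario(int extra)
{
    struct model_desc d = {0};

    if (!model_irq_enter(&d))
        return -1;
    for (int i = 0; i < extra; i++)
        (void)model_irq_enter(&d);
    model_irq_exit(&d);
    return d.resent;
}
```

In this model, model_scenario(2) yields a single resend: both interrupts that
were bounced while the handler was running collapse into one pending flag,
which is replayed once after the handler on the original CPU has finished.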

Looking forward to your feedback, thanks.

>>
>> I'm also wondering what will happen for users who don't have the
>> chip->irq_retrigger callback set and fall back to the tasklet
>> via irq_sw_resend()... Looks like it will work fine. However if we do my
>> suggestion and move check_irq_resend to the end of handle_fasteoi_irq then
>> the tasklet will be scheduled on the old CPU again, which may be sub-
>> optimal.
>
> Just checking to see if you've had a chance to consider these
> issues/thoughts, and if/how they should be handled?
> I'm still tending towards saying that the check_irq_resend() should run
> after handle_irq_event() and the IRQS_PENDING flag should be wrangled to
> decide whether or not to resend.
>
> I just don't know if having the tasklet scheduled and run on the original
> CPU via irq_sw_resend() would be problematic or not. In general it
> probably won't but in the CPU offlining case.... maybe? (I realise that for
> GIC-v3 the tasklet won't be used because GIC has chip->irq_retrigger
> callback defined - I'm just thinking in general here, especially so
> assuming we drop the new IRQD_RESEND_WHEN_IN_PROGRESS flag).
>
> Thoughts?
>
> I can put together a PoC and test it along with Yipeng from Huawei if you
> think it sounds reasonable.
>
> JG