Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with edge trigger detection for TINT

From: Marc Zyngier
Date: Thu Sep 21 2023 - 13:51:43 EST


On Tue, 19 Sep 2023 18:06:54 +0100,
Biju Das <biju.das.jz@xxxxxxxxxxxxxx> wrote:
>
> Hi Marc Zyngier,
>
> > Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with edge
> > trigger detection for TINT
> >
> > On Tue, 19 Sep 2023 17:32:05 +0100,
> > Biju Das <biju.das.jz@xxxxxxxxxxxxxx> wrote:
> >
> > [...]
> >
> > > > So you mean that you *already* lose interrupts across a disable
> > > > followed by an enable? I'm slightly puzzled...
> > >
> > > There is no interrupt lost at all.
> > >
> > > Currently this patch addresses 2 issues.
> > >
> > > Scenario 1: Extra interrupt when we select TINT source on enable_irq()
> > >
> > > Getting an extra interrupt, when client drivers calls enable_irq()
> > > during probe()/resume(). In this case, the irq handler on the Client
> > > driver just clear the interrupt status bit.
> > >
> > > Issue 2: IRQ storm when we select TINT source on enable_irq()
> > >
> > > Here as well, we are getting an extra interrupt, when client drivers
> > > calls enable_irq() during probe() and this Interrupts getting
> > > generated infinitely, when the client driver calls disable_irq() in
> > > irq handler and in in work queue calling enable_irq().
> >
> > How do you know this is a spurious interrupt?
>
> We have PMOD on RZ/G2L SMARC EVK. So I connected it to GPIO pin
> and other end to ground. During the boot, I get an interrupt
> even though there is no high to low transition, when the IRQ is setup
> in the probe(). From this it is a spurious interrupt.

That doesn't really handle my question. At the point of enabling the
interrupt and consuming the edge (which is what this patch does), how
do you know you can readily discard this signal? This is a genuine
question.

Spurious interrupts at boot are common. The HW resets in a funky,
unspecified state, and it's SW's job to initialise it before letting
other agents in the system use interrupts.

>
> > For all you can tell, you are
> > just consuming an edge. I absolutely don't buy this workaround, because you
> > have no context that allows you to discriminate between a real spurious
> > interrupt and a normal interrupt that lands while the interrupt line was
> > masked.
> >
> > > Currently we are not loosing interrupts, but we are getting additional
> > > Interrupt(phantom) which is causing the issue.
> >
> > If you get an interrupt at probe time in the endpoint driver, that's
> > probably because the device is not in a quiescent state when the interrupt
> > is requested. And it is probably this that needs addressing.
>
> Any pointer for addressing this issue?

Nothing but the most basic stuff: you should make sure that the
interrupt isn't enabled before you can actually handle it, and triage
it as spurious.

M.

--
Without deviation from the norm, progress is not possible.