Re: [PATCH] gpio: pca953x: Survive spurious interrupts
From: Andy Shevchenko
Date: Wed Oct 07 2020 - 10:03:12 EST
On Wed, Oct 7, 2020 at 4:20 PM Marc Zyngier <maz@xxxxxxxxxx> wrote:
> On 2020-10-07 14:10, Andy Shevchenko wrote:
> > On Wed, Oct 7, 2020 at 3:09 PM Marc Zyngier <maz@xxxxxxxxxx> wrote:
> >> On 2020-10-07 13:02, Andy Shevchenko wrote:
> >> > On Wed, Oct 7, 2020 at 12:49 PM Linus Walleij
> >> > <linus.walleij@xxxxxxxxxx> wrote:
> >> >> On Mon, Oct 5, 2020 at 4:02 PM Marc Zyngier <maz@xxxxxxxxxx> wrote:
> >> >>
> >> >> > The pca953x driver never checks the result of irq_find_mapping(),
> >> >> > which returns 0 when no mapping is found. When a spurious interrupt
> >> >> > is delivered (which can happen under obscure circumstances), the
> >> >> > kernel explodes as it still tries to handle the error code as
> >> >> > a real interrupt.
> >> >> >
> >> >> > Handle this particular case and warn on spurious interrupts.
> >> >> >
> >> >> > Signed-off-by: Marc Zyngier <maz@xxxxxxxxxx>
> >> >
> >> > Wait, doesn't actually [1] fix the reported issue?
> >>
> >> Not at all.
> >>
> >> > Marc, can you confirm this?
> >> >
> >> > [1]: e43c26e12dd4 ("gpio: pca953x: Fix uninitialized pending variable")
> >>
> >> Different bug, really. If an interrupt is *really* pending, and no
> >> mapping established yet, feeding the result of irq_find_mapping() to
> >> handle_nested_irq() will lead to a panic.
> >
> > I don't understand. We have plenty of drivers doing exactly the way
> > without checking this returned code.
>
> I'm sure we do. Most driver code is buggy as hell, but I don't see that
> as a reason to cargo-cult the crap. The API is crystal clear that it can
> return 0 for no mapping, and 0 isn't a valid interrupt.
Yes, and the problem here is that we got this response from IRQ core,
which we shouldn't.
> > What circumstances makes the mapping be absent?
>
> Other bugs in the system ([1]), spurious interrupts (which can *always*
> happen).
>
> > Shouldn't we rather change this:
> >
> > girq->handler = handle_simple_irq;
> > to this:
> > girq->handler = handle_bad_irq;
> > ?
>
> I don't understand what you are trying to achieve with that, apart from
> maybe breaking the driver. The right way to handle spurious interrupts
> is by telling the core code that the interrupt wasn't handled, and to
> let
> the spurious interrupt code do its magic.
handle_bad_irq() is exactly for handling spurious IRQs as far as we
believe documentation. So, by default the driver assigns (should
assign) handle_bad_irq() to all IRQs as a default handler. If, by any
chance, we got it, we already have a proper handler in place. The read
handler is assigned whenever the IRQ core is called to register it (by
means of ->irq_set_type() callback). My understanding that GPIO IRQ
drivers are designed (should be designed) in this way. The approach
will make us sure that we don't have spurious interrupts with assigned
handlers.
> [1] https://lore.kernel.org/r/20201005111443.1390096-1-maz@xxxxxxxxxx
--
With Best Regards,
Andy Shevchenko