Re: [PATCH] spi: spi-geni-qcom: Fix NULL pointer access in geni_spi_isr

From: Stephen Boyd
Date: Fri Dec 11 2020 - 20:33:44 EST


Quoting Doug Anderson (2020-12-10 17:51:53)
> Hi,
>
> On Thu, Dec 10, 2020 at 5:39 PM Stephen Boyd <swboyd@xxxxxxxxxxxx> wrote:
> >
> > Quoting Doug Anderson (2020-12-10 17:30:17)
> > > On Thu, Dec 10, 2020 at 5:21 PM Stephen Boyd <swboyd@xxxxxxxxxxxx> wrote:
> > > >
> > > > Yeah and so if it comes way later because it timed out then what's the
> > > > point of calling synchronize_irq() again? To make the completion
> > > > variable set when it won't be tested again until it is reinitialized?
> > >
> > > Presumably the idea is to try to recover to a somewhat usable state
> > > again? We're not rebooting the machine so, even though this transfer
> > > failed, we will undoubtedly do another transfer later. If that
> > > "abort" interrupt comes way later while we're setting up the next
> > > transfer we'll really confuse ourselves.
> >
> > The interrupt handler just sets a completion variable. What does that
> > confuse?
>
> The interrupt handler sees a "DONE" interrupt. If we've made it far
> enough into setting up the next transfer that "cur_xfer" has been set
> then it might do more, no?

I thought it saw a cancel/abort EN bit?

if (m_irq & M_CMD_CANCEL_EN)
complete(&mas->cancel_done);
if (m_irq & M_CMD_ABORT_EN)
complete(&mas->abort_done)

and only a DONE bit if a transfer happened.

>
>
> > > I guess you could go the route of adding a synchronize_irq() at the
> > > start of the next transfer, but I'd rather add the overhead in the
> > > exceptional case (the timeout) than the normal case. In the normal
> > > case we don't need to worry about random IRQs from the past transfer
> > > suddenly showing up.
> > >
> >
> > How does adding synchronize_irq() at the end guarantee that the abort is
> > cleared out of the hardware though? It seems to assume that the abort is
> > pending at the GIC when it could still be running through the hardware
> > and not executed yet. It seems like a synchronize_irq() for that is
> > wishful thinking that the irq is merely pending even though it timed
> > out and possibly never ran. Maybe it's stuck in a write buffer in the
> > CPU?
>
> I guess I'm asserting that if a full second passed (because we timed
> out) and after that full second no interrupts are pending then the
> interrupt will never come. That seems a reasonable assumption to me.
> It seems hard to believe it'd be stuck in a write buffer for a full
> second?
>

Ok, so if we don't expect an irq to come in why are we calling
synchronize_irq()? I'm lost.