Re: [PATCH 1/2] spi: spi-geni-qcom: Fix geni_spi_isr() NULL dereference in timeout case
From: Doug Anderson
Date: Wed Dec 16 2020 - 17:44:05 EST
Hi,
On Mon, Dec 14, 2020 at 6:29 PM Stephen Boyd <swboyd@xxxxxxxxxxxx> wrote:
>
> Here's a shortened version:
>
> CPU0 CPU1
> ---- ----
> setup_fifo_xfer()
> geni_se_setup_m_cmd()
> <hardware starts transfer>
> <transfer completes in hardware>
> <hardware sets M_RX_FIFO_WATERMARK_EN in m_irq>
> ...
> handle_fifo_timeout()
> spin_lock_irq(mas->lock)
> mas->cur_xfer = NULL
> geni_se_cancel_m_cmd()
> spin_unlock_irq(mas->lock)
>
> geni_spi_isr()
> spin_lock(mas->lock)
> if (m_irq & M_RX_FIFO_WATERMARK_EN)
> geni_spi_handle_rx()
> mas->cur_xfer NULL dereference!
>
> Two CPUs also don't really matter but I guess that's fine.
OK, replaced it with your version.
> > Specifically it should be noted that the RX/TX interrupts are still
> > shown asserted even when a CANCEL/ABORT interrupt has asserted.
>
> Can we have 'TL;DR: Seriously delayed interrupts for RX/TX can lead to
> timeout handling setting mas->cur_xfer to NULL.'?
Sure, added this. ...but made the super important change that "tl;dr"
is more conventionally lower case. :-P
> > Let's check for the NULL transfer in the TX and RX cases.
>
> and reset the watermark or clear out the fifo respectively to put the
> hardware back into a sane state.
Sure.
> > @@ -396,6 +402,17 @@ static void geni_spi_handle_rx(struct spi_geni_master *mas)
> > if (rx_last_byte_valid && rx_last_byte_valid < 4)
> > rx_bytes -= bytes_per_fifo_word - rx_last_byte_valid;
> > }
> > +
> > + /* Clear out the FIFO and bail if nowhere to put it */
> > + if (mas->cur_xfer == NULL) {
>
> I think if (!mas->cur_xfer) is more kernel idiomatic, but sure.
I've been yelled at both ways, but changed it to your way here.
> > + for (i = 0; i < words; i++)
>
> while (i++ < DIV_ROUND_UP(rx_bytes, bytes_per_fifo_word))
> readl(se->base + SE_GENI_RX_FIFOn);
Sure, that's fine. I was marginally worried that the compiler
wouldn't know it could optimize the test and would do the divide every
time, but I guess that's pretty dang unlikely and also not a place we
really care about optimizing a lot. I'm also not a huge fan of
relying on loop counters being initted at the start of the function,
but I guess it's OK. Changed to your syntax.
-Doug