Re: [PATCH] spi: spi-geni-qcom: Fix NULL pointer access in geni_spi_isr
From: Stephen Boyd
Date: Wed Dec 09 2020 - 22:17:50 EST
Quoting Doug Anderson (2020-12-03 08:40:46)
> I would guess that if "mas->cur_xfer" is NULL then
> geni_spi_handle_rx() should read all data in the FIFO and throw it
> away and geni_spi_handle_tx() should set SE_GENI_TX_WATERMARK_REG to
> 0. NOTE: I _think_ that with the synchronize_irq() I'm suggesting
> above we'll avoid this case, but it never hurts to be defensive.
>
>
> Does that all make sense? So the summary is that instead of your patch:
Can we get a CPU diagram describing the race and scenario where this
happens? Something like:
CPU0 CPU1
---- ----
setup_fifo_xfer()
spin_lock_irq(&mas->lock);
spin_unlock_irq(&mas->lock);
mas->cur_xfer = xfer
...
<IRQ>
geni_spi_isr()
geni_spi_handle_rx()
<NULL deref boom explosion!>
But obviously this example diagram is incorrect and some timeout happens
instead? Sorry, I'm super lazy and don't want to read many paragraphs of
text. :) I'd rather have a diagram like above that clearly points out
the steps taken to the NULL pointer deref.
>
> 1. Add synchronize_irq() at the start and end of
> handle_fifo_timeout(). Not under lock.
>
> 2. In geni_spi_handle_rx(), check for NULL "mas->cur_xfer". Read all
> data in the FIFO (don't cap at rx_rem_bytes), but throw it away.
>
> 3. In geni_spi_handle_tx(), check for NULL "mas->cur_xfer". Don't
> write any data. Just write 0 to SE_GENI_TX_WATERMARK_REG.
>
> I think #1 is the real fix, but #2 and #3 will avoid crashes in case
> there's another bug somewhere.
>
Aren't 2 and 3 papering over some weird problem though where irqs are
coming in unexpectedly?