Re: [PATCH v4 2/5] spi: spi-geni-qcom: Mo' betta locking

From: Stephen Boyd
Date: Thu Jun 18 2020 - 13:52:32 EST


Quoting Douglas Anderson (2020-06-18 08:06:23)
> If you added a bit of a delay (like a trace_printk) into the ISR for
> the spi-geni-qcom driver, you would suddenly start seeing some errors
> spit out. The problem was that, though the ISR itself held a lock,
> other parts of the driver didn't always grab the lock.
>
> One example race was this:
> CPU0 CPU1
> ---- ----
> spi_geni_set_cs()
> mas->cur_mcmd = CMD_CS;
> geni_se_setup_m_cmd(...)
> wait_for_completion_timeout(&xfer_done);
> <INTERRUPT>
> geni_spi_isr()
> complete(&xfer_done);
> <wakeup>
> pm_runtime_put(mas->dev);
> ... // back to SPI core
> spi_geni_transfer_one()
> setup_fifo_xfer()
> mas->cur_mcmd = CMD_XFER;
> mas->cur_cmd = CMD_NONE; // bad!
> return IRQ_HANDLED;
>
> Let's fix this. Before we start messing with hardware, we'll grab the
> lock to make sure that the IRQ handler from some previous command has
> really finished. We don't need to hold the lock unless we're in a
> state where more interrupts can come in, but we at least need to make
> sure the previous IRQ is done. This lock is used exclusively to
> prevent the IRQ handler and non-IRQ from stomping on each other. The
> SPI core handles all other mutual exclusion.
>
> As part of this, we change the way that the IRQ handler detects
> spurious interrupts. Previously we checked for our state variable
> being set to IRQ_NONE, but that was done outside the spinlock. We
> could move it into the spinlock, but instead let's just change it to
> look for the lack of any IRQ status bits being set. This can be done
> outside the lock--the hardware certainly isn't grabbing or looking at
> the spinlock when it updates its status register.
>
> It's possible that this will fix real (but very rare) errors seen in
> the field that look like:
> irq ...: nobody cared (try booting with the "irqpoll" option)
>
> NOTE: an alternate strategy considered here was to always make the
> complete() / spi_finalize_current_transfer() the very last thing in
> our IRQ handler. With such a change you could consider that we could
> be "lockless". In that case, though, we'd have to be very careful w/
> memory barriers so we made sure we didn't have any bugs with weakly
> ordered memory. Using spinlocks makes the driver much easier to
> understand.
>
> Fixes: 561de45f72bd ("spi: spi-geni-qcom: Add SPI driver support for GENI based QUP")
> Signed-off-by: Douglas Anderson <dianders@xxxxxxxxxxxx>
> ---

Reviewed-by: Stephen Boyd <swboyd@xxxxxxxxxxxx>