Re: [PATCH] spi: dw: fix race between transfer IRQ handler and timeout handler

From: Mark Brown

Date: Mon May 25 2026 - 10:34:22 EST


On Fri, May 22, 2026 at 05:57:27PM +0800, Peng Yang wrote:
> dw_spi_transfer_handler() can race with dw_spi_handle_err() on SMP.
> When an IRQ-based transfer times out, the error path resets the chip
> while the IRQ handler is still accessing the FIFO on another CPU.
> This causes bus errors on platforms where empty FIFO access is illegal.

> Fix by adding a spinlock around FIFO access and chip reset to make them
> in serial.


> @@ -240,7 +243,9 @@ static irqreturn_t dw_spi_transfer_handler(struct dw_spi *dws)
> * have the TXE IRQ flood at the final stage of the transfer.
> */
> if (irq_status & DW_SPI_INT_TXEI) {
> + spin_lock_irqsave(&dws->buf_lock, flags);
> dw_writer(dws);
> + spin_unlock_irqrestore(&dws->buf_lock, flags);

I am having an incredibly hard time convincing myself that these very
tight locking windows are safe, especially given that we're using a
value for irq_status we took before the lock.

> static inline void dw_spi_abort(struct spi_controller *ctlr)
> {
> struct dw_spi *dws = spi_controller_get_devdata(ctlr);
> + unsigned long flags;
>
> if (dws->dma_mapped)
> dws->dma_ops->dma_stop(dws);
>
> + spin_lock_irqsave(&dws->buf_lock, flags);
> dw_spi_reset_chip(dws);
> + spin_unlock_irqrestore(&dws->buf_lock, flags);
> }

What prevents an abort happening between checking the interrupt status
and trying to do a read or write, and/or is it safe for the readers and
writers to run after we reset the hardware? We didn't reset any driver
state AFAICT, just the hardware.

Attachment: signature.asc
Description: PGP signature