Re: [PATCH] spi: dw: fix race between transfer IRQ handler and timeout handler
From: Mark Brown
Date: Mon May 25 2026 - 10:34:22 EST
On Fri, May 22, 2026 at 05:57:27PM +0800, Peng Yang wrote:
> dw_spi_transfer_handler() can race with dw_spi_handle_err() on SMP.
> When an IRQ-based transfer times out, the error path resets the chip
> while the IRQ handler is still accessing the FIFO on another CPU.
> This causes bus errors on platforms where empty FIFO access is illegal.
> Fix by adding a spinlock around FIFO access and chip reset to make them
> in serial.
> @@ -240,7 +243,9 @@ static irqreturn_t dw_spi_transfer_handler(struct dw_spi *dws)
> * have the TXE IRQ flood at the final stage of the transfer.
> */
> if (irq_status & DW_SPI_INT_TXEI) {
> + spin_lock_irqsave(&dws->buf_lock, flags);
> dw_writer(dws);
> + spin_unlock_irqrestore(&dws->buf_lock, flags);
I am having an incredibly hard time convincing myself that these very
tight locking windows are safe, especially given that we're using a
value for irq_status we took before the lock.
> static inline void dw_spi_abort(struct spi_controller *ctlr)
> {
> struct dw_spi *dws = spi_controller_get_devdata(ctlr);
> + unsigned long flags;
>
> if (dws->dma_mapped)
> dws->dma_ops->dma_stop(dws);
>
> + spin_lock_irqsave(&dws->buf_lock, flags);
> dw_spi_reset_chip(dws);
> + spin_unlock_irqrestore(&dws->buf_lock, flags);
> }
What prevents an abort happening between checking the interrupt status
and trying to do a read or write, and/or is it safe for the readers and
writers to run after we reset the hardware? We didn't reset any driver
state AFAICT, just the hardware.
Attachment:
signature.asc
Description: PGP signature