Re: [PATCH 2/5] dmaengine: sf-pdma: fix race between done and error interrupts
From: Frank Li
Date: Fri Feb 20 2026 - 15:27:13 EST
On Sat, Feb 21, 2026 at 03:43:54AM +0800, Max Hsu wrote:
> According to the FU540-C000 v1p5 [1] and FU740-C000 v1p7 [2] specs,
> when a DMA transaction error occurs, the hardware sets both the
> DONE and ERROR interrupt bits simultaneously.
Nit: need extra empty line between two paragraph.
> On SMP systems, this can cause the done_isr and err_isr to execute
> concurrently on different CPUs, leading to race conditions and
> NULL pointer dereferences.
On SMP systems, the race conditons and NULL pointer dereferences happen if
the done_isr and err_isr to execute concurrently on different CPUs
>
> Fix by:
> - In done_isr: abort if ERROR bit is set or DONE bit was already cleared
> - In err_isr: clear both DONE and ERROR bits to prevent done_isr from
> processing the same transaction
>
> Link: https://www.sifive.com/document-file/freedom-u540-c000-manual [1]
> Link: https://www.sifive.com/document-file/freedom-u740-c000-manual [2]
> Fixes: 6973886ad58e ("dmaengine: sf-pdma: add platform DMA support for HiFive Unleashed A00")
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Max Hsu <max.hsu@xxxxxxxxxx>
> ---
> drivers/dma/sf-pdma/sf-pdma.c | 20 ++++++++++++++++++--
> 1 file changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/dma/sf-pdma/sf-pdma.c b/drivers/dma/sf-pdma/sf-pdma.c
> index 7ad3c29be146..ac7d3b127a24 100644
> --- a/drivers/dma/sf-pdma/sf-pdma.c
> +++ b/drivers/dma/sf-pdma/sf-pdma.c
> @@ -346,9 +346,25 @@ static irqreturn_t sf_pdma_done_isr(int irq, void *dev_id)
> struct sf_pdma_chan *chan = dev_id;
> struct pdma_regs *regs = &chan->regs;
> u64 residue;
> + u32 control_reg;
>
> spin_lock(&chan->vchan.lock);
> - writel((readl(regs->ctrl)) & ~PDMA_DONE_STATUS_MASK, regs->ctrl);
> + control_reg = readl(regs->ctrl);
> + if (control_reg & PDMA_ERR_STATUS_MASK) {
> + spin_unlock(&chan->vchan.lock);
> + return IRQ_HANDLED;
> + }
> +
> + /*
> + * Check if DONE bit is still set. If not, the error ISR on another
> + * CPU has already cleared it, so abort to avoid double-processing.
> + */
> + if (!(control_reg & PDMA_DONE_STATUS_MASK)) {
> + spin_unlock(&chan->vchan.lock);
> + return IRQ_HANDLED;
> + }
> +
> + writel((control_reg & ~PDMA_DONE_STATUS_MASK), regs->ctrl);
> residue = readq(regs->residue);
>
> if (!residue) {
> @@ -375,7 +391,7 @@ static irqreturn_t sf_pdma_err_isr(int irq, void *dev_id)
> struct pdma_regs *regs = &chan->regs;
>
> spin_lock(&chan->lock);
> - writel((readl(regs->ctrl)) & ~PDMA_ERR_STATUS_MASK, regs->ctrl);
> + writel((readl(regs->ctrl)) & ~(PDMA_DONE_STATUS_MASK | PDMA_ERR_STATUS_MASK), regs->ctrl);
> spin_unlock(&chan->lock);
>
> tasklet_schedule(&chan->err_tasklet);
ideally, it'd better handle by one function
sf_pdma_isr()
{
stat = readl(regs->ctrl);
writel(stat & ~(PDMA_DONE_STATUS_MASK | PDMA_ERR_STATUS_MASK), regs->ctrl);
if (err)
return err_handle();
if (done)
return done_handle();
return IRQ_NONE;
}
Anyways, this also work
Reviewed-by: Frank Li <Frank.Li@xxxxxxx>
>
> --
> 2.43.0
>