Re: [PATCH] dmaengine: plx_dma: Fix potential deadlock on &plxdev->ring_lock

From: Logan Gunthorpe
Date: Wed Jul 26 2023 - 17:10:59 EST




On 2023-07-26 15:00, Christophe JAILLET wrote:
> Le 26/07/2023 à 17:57, Logan Gunthorpe a écrit :
>>
>>
>> On 2023-07-26 04:48, Chengfeng Ye wrote:
>>> As plx_dma_process_desc() is invoked by both tasklet plx_dma_desc_task()
>>> under softirq context and plx_dma_tx_status() callback that executed
>>> under
>>> process context, the lock aquicision of &plxdev->ring_lock inside
>>> plx_dma_process_desc() should disable irq otherwise deadlock could
>>> happen
>>> if the irq preempts the execution of process context code while the lock
>>> is held in process context on the same CPU.
>>>
>>> Possible deadlock scenario:
>>> plx_dma_tx_status()
>>>      -> plx_dma_process_desc()
>>>      -> spin_lock(&plxdev->ring_lock)
>>>          <tasklet softirq>
>>>          -> plx_dma_desc_task()
>>>          -> plx_dma_process_desc()
>>>          -> spin_lock(&plxdev->ring_lock) (deadlock here)
>>>
>>> This flaw was found by an experimental static analysis tool I am
>>> developing
>>> for irq-related deadlock.
>>>
>>> The tentative patch fixes the potential deadlock by
>>> spin_lock_irqsave() in
>>> plx_dma_process_desc() to disable irq while lock is held.
>>>
>>> Signed-off-by: Chengfeng Ye <dg573847474@xxxxxxxxx>
>>
>> Makes sense. Thanks!
>>
>> Reviewed-by: Logan Gunthorpe <logang@xxxxxxxxxxxx>
>>
>> Logan
>>
>
> Hi,
>
> as explained in another reply [1], would spin_lock_bh() be enough in
> such a case?

The driver originally used spin_lock_bh(). It was removed by Yunbo Yu in
2022 who said that it was unnecessary to be used with a tasklet:

1d05a0bdb420 ("dmaengine: plx_dma: Move spin_lock_bh() to spin_lock() ")

If spin_lock_bh() is correct (which is what I originally thought when I
wrote the driver, though I'm a bit confused now) then I guess that
Yunbo's change was just incorrect. It sounded sensible at the time, but
it looks like there are two call sites of plx_dma_desc_task(): one in
the tasklet and one not in the tasklet. The one not in the tasklet needs
to use the bh version.

So perhaps we should just revert 1d05a0bdb420?

Logan