Re: [RFC PATCH] dma-fence: Fix races of fence callbacks versus destructors by locking

From: Philipp Stanner

Date: Mon Jun 08 2026 - 11:56:28 EST


On Mon, 2026-06-08 at 17:01 +0200, Boris Brezillon wrote:
> On Mon,  8 Jun 2026 16:24:37 +0200
> Philipp Stanner <phasta@xxxxxxxxxx> wrote:
>
> > @@ -1020,11 +1024,20 @@ EXPORT_SYMBOL(dma_fence_wait_any_timeout);
> >  void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
> >  {
> >   const struct dma_fence_ops *ops;
> > + unsigned long flags;
> >  
> >   rcu_read_lock();
> >   ops = rcu_dereference(fence->ops);
> > - if (ops && ops->set_deadline && !dma_fence_is_signaled(fence))
> > + if (!ops || !ops->set_deadline) {
> > + rcu_read_unlock();
> > + return;
> > + }
> > +
> > + dma_fence_lock_irqsave(fence, flags);
> > + if (!dma_fence_is_signaled_locked(fence))
> >   ops->set_deadline(fence, deadline);
>
> You can't take the fence lock around ->set_deadline(), otherwise you'll
> deadlock here [1] or here [2].
>
> > +
> > + dma_fence_unlock_irqrestore(fence, flags);
> >   rcu_read_unlock();
> >  }
>
>
> [1]https://elixir.bootlin.com/linux/v7.0.11/source/drivers/dma-buf/sw_sync.c#L182
> [2]https://elixir.bootlin.com/linux/v7.0.11/source/drivers/gpu/drm/msm/msm_fence.c#L139


If we'd port these (and maybe some we have overlooked) simultaneously,
they could completely drop their separate locking.

The fact that other parties were forced to take the fence lock in their
callbacks (and even 100% of the functions' code) actually proves that
this RFC is probably a good idea and callback-calls should be guarded
by the fence lock :]


P.