Re: [PATCH v4 10/15] drm/shmem-helper: Take reservation lock instead of drm_gem_shmem locks

From: Daniel Vetter
Date: Wed May 11 2022 - 15:05:19 EST


On Wed, May 11, 2022 at 06:40:32PM +0300, Dmitry Osipenko wrote:
> On 5/11/22 18:29, Daniel Vetter wrote:
> > On Wed, May 11, 2022 at 06:14:00PM +0300, Dmitry Osipenko wrote:
> >> On 5/11/22 17:24, Christian König wrote:
> >>> Am 11.05.22 um 15:00 schrieb Daniel Vetter:
> >>>> On Tue, May 10, 2022 at 04:39:53PM +0300, Dmitry Osipenko wrote:
> >>>>> [SNIP]
> >>>>> Since vmapping implies implicit pinning, we can't use a separate lock in
> >>>>> drm_gem_shmem_vmap() because we need to protect the
> >>>>> drm_gem_shmem_get_pages(), which is invoked by drm_gem_shmem_vmap() to
> >>>>> pin the pages and requires the dma_resv_lock to be locked.
> >>>>>
> >>>>> Hence the problem is:
> >>>>>
> >>>>> 1. If dma-buf importer holds the dma_resv_lock and invokes
> >>>>> dma_buf_vmap() -> drm_gem_shmem_vmap(), then drm_gem_shmem_vmap() shall
> >>>>> not take the dma_resv_lock.
> >>>>>
> >>>>> 2. Since dma-buf locking convention isn't specified, we can't assume
> >>>>> that dma-buf importer holds the dma_resv_lock around dma_buf_vmap().
> >>>>>
> >>>>> The possible solutions are:
> >>>>>
> >>>>> 1. Specify the dma_resv_lock convention for dma-bufs and make all
> >>>>> drivers to follow it.
> >>>>>
> >>>>> 2. Make only DRM drivers to hold dma_resv_lock around dma_buf_vmap().
> >>>>> Other non-DRM drivers will get the lockdep warning.
> >>>>>
> >>>>> 3. Make drm_gem_shmem_vmap() to take the dma_resv_lock and get deadlock
> >>>>> if dma-buf importer holds the lock.
> >>>>>
> >>>>> ...
> >>>> Yeah this is all very annoying.
> >>> Ah, yes that topic again :)
> >>>
> >>> I think we could relatively easily fix that by just defining and
> >>> enforcing that the dma_resv_lock must have be taken by the caller when
> >>> dma_buf_vmap() is called.
> >>>
> >>> A two step approach should work:
> >>> 1. Move the call to dma_resv_lock() into the dma_buf_vmap() function and
> >>> remove all lock taking from the vmap callback implementations.
> >>> 2. Move the call to dma_resv_lock() into the callers of dma_buf_vmap()
> >>> and enforce that the function is called with the lock held.
> >> I've doubts about the need to move out the dma_resv_lock() into the
> >> callers of dma_buf_vmap()..
> >>
> >> I looked through all the dma_buf_vmap() users and neither of them
> >> interacts with dma_resv_lock() at all, i.e. nobody takes the lock
> >> in/outside of dma_buf_vmap(). Hence it's easy and more practical to make
> >> dma_buf_mmap/vmap() to take the dma_resv_lock by themselves.
> > i915_gem_dmabuf_vmap -> i915_gem_object_pin_map_unlocked ->
> > i915_gem_object_lock -> dma_resv_lock
> >
> > And all the ttm drivers should work similarly. So there's definitely
> > drivers which grab dma_resv_lock from their vmap callback.
>
> Grr.. I'll take another look.
>
> >> It's unclear to me which driver may ever want to do the mapping under
> >> the dma_resv_lock. But if we will ever have such a driver that will need
> >> to map imported buffer under dma_resv_lock, then we could always add the
> >> dma_buf_vmap_locked() variant of the function. In this case the locking
> >> rule will sound like this:
> >>
> >> "All dma-buf importers are responsible for holding the dma-reservation
> >> lock around the dmabuf->ops->mmap/vmap() calls."
>
> Are you okay with this rule?

Yeah I think long-term it's where we want to be, just trying to find
clever ways to get there.

And I think Christian agrees with that?

> >>> It shouldn't be that hard to clean up. The last time I looked into it my
> >>> main problem was that we didn't had any easy unit test for it.
> >> Do we have any tests for dma-bufs at all? It's unclear to me what you
> >> are going to test in regards to the reservation locks, could you please
> >> clarify?
> > Unfortunately not really :-/ Only way really is to grab a driver which
> > needs vmap (those are mostly display drivers) on an imported buffer, and
> > see what happens.
> >
> > 2nd best is liberally sprinkling lockdep annotations all over the place
> > and throwing it at intel ci (not sure amd ci is accessible to the public)
> > and then hoping that's good enough. Stuff like might_lock and
> > dma_resv_assert_held.
>
> Alright

So throwing it at intel-gfx-ci can't hurt I think, but that only covers
i915 so doesn't really help with the bigger issue of catching all the
drivers.

Cheers, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch