Re: [PATCH v13 01/10] drm/shmem-helper: Switch to reservation lock

From: Christian König
Date: Sun Mar 26 2023 - 05:19:37 EST


Am 25.03.23 um 15:58 schrieb Dmitry Osipenko:
On 3/15/23 16:46, Dmitry Osipenko wrote:
On 3/14/23 05:26, Dmitry Osipenko wrote:
@@ -633,7 +605,10 @@ int drm_gem_shmem_mmap(struct drm_gem_shmem_object *shmem, struct vm_area_struct
return ret;
}
+ dma_resv_lock(shmem->base.resv, NULL);
ret = drm_gem_shmem_get_pages(shmem);
+ dma_resv_unlock(shmem->base.resv);
Intel CI reported locking problem [1] here. It actually was also
reported for v12, but I missed that report because of the other noisy
reports.

[1]
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114671v2/shard-snb5/igt@prime_vgem@sync@xxxxxxxxx

The test does the following:

1. creates vgem
2. export it do dmabuf
3. mmaps dmabuf

There is an obvious deadlock there. The DRM code assumes that mmap() is
unlocked, while for dma-buf it's locked. I'll see how to fix it for v14.

Christian, there is a deadlock problem in drm_gem_shmem_mmap() once we
move drm-shmem to use resv lock. The current dma-buf locking policy
claims that importer holds the lock for mmap(), but DRM code assumes
that obj->mmap() handles the locking itself and then the same
obj->mmap() code path is used by both dma-buf DRM and a usual DRM object
paths. Hence importer -> dma_buf_mmap_internal()[takes the lock] ->
exporter -> drm_gem_shmem_mmap()[takes the lock] deadlocks.

I was looking at how to fix it and to me the best option is to change
the dma-buf locking policy, making exporter responsible for handling the
resv lock. Changing DRM code mmap lockings might be possible too [moving
locking to drm_gem_mmap_obj()], but will be very messy and doesn't feel
intuitive.

Want to get yours thoughts on this before sending out the dma-buf mmap()
policy-change patch. Does the new mmap() locking policy sound good to
you? Thanks!


IIRC we tried that before and ran into problems.

dma_buf_mmap() needs to swap the backing file of the VMA and for this calls fput() on the old file.

This fput() in turn could (in theory) grab the resv lock as well and there isn't anything we could do about that.

Just information from the back of my memory, probably best if you double check that.

Regards,
Christian.