Re: [PATCH v2] drm/panfrost: Fix dma_resv deadlock at drm object pin time

From: Boris Brezillon
Date: Wed Apr 24 2024 - 03:04:47 EST


On Tue, 23 Apr 2024 23:46:23 +0100
Adrián Larumbe <adrian.larumbe@xxxxxxxxxxxxx> wrote:

> When Panfrost must pin an object that is being prepared a dma-buf
> attachment for on behalf of another driver, the core drm gem object pinning
> code already takes a lock on the object's dma reservation.
>
> However, Panfrost GEM object's pinning callback would eventually try taking
> the lock on the same dma reservation when delegating pinning of the object
> onto the shmem subsystem, which led to a deadlock.
>
> This can be shown by enabling CONFIG_DEBUG_WW_MUTEX_SLOWPATH, which throws
> the following recursive locking situation:
>
> weston/3440 is trying to acquire lock:
> ffff000000e235a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: drm_gem_shmem_pin+0x34/0xb8 [drm_shmem_helper]
> but task is already holding lock:
> ffff000000e235a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: drm_gem_pin+0x2c/0x80 [drm]
>
> Fix it by assuming the object's reservation had already been locked by the
> time we reach panfrost_gem_pin.
>
> Do the same thing for the Lima driver, as it most likely suffers from the
> same issue.
>
> Cc: Thomas Zimmermann <tzimmermann@xxxxxxx>
> Cc: Dmitry Osipenko <dmitry.osipenko@xxxxxxxxxxxxx>
> Cc: Boris Brezillon <boris.brezillon@xxxxxxxxxxxxx>
> Cc: Steven Price <steven.price@xxxxxxx>
> Fixes: a78027847226 ("drm/gem: Acquire reservation lock in drm_gem_{pin/unpin}()")
> Reviewed-by: Boris Brezillon <boris.brezillon@xxxxxxxxxxxxx>
> Signed-off-by: Adrián Larumbe <adrian.larumbe@xxxxxxxxxxxxx>
> ---
> drivers/gpu/drm/lima/lima_gem.c | 9 +++++++--
> drivers/gpu/drm/panfrost/panfrost_gem.c | 8 +++++++-
> 2 files changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
> index 7ea244d876ca..8a5bcf498ef6 100644
> --- a/drivers/gpu/drm/lima/lima_gem.c
> +++ b/drivers/gpu/drm/lima/lima_gem.c
> @@ -184,8 +184,13 @@ static int lima_gem_pin(struct drm_gem_object *obj)
>
>  	if (bo->heap_size)
>  		return -EINVAL;
> -
> -	return drm_gem_shmem_pin(&bo->base);
> +	/*
> +	 * Pinning can only happen in response to a prime attachment request
> +	 * from another driver, but dma reservation locking is already being
> +	 * handled by drm_gem_pin

This comment looks a bit weird now that you call a function that
doesn't have the _locked suffix. I'd be tempted to drop it, or to
clarify that drm_gem_shmem_object_pin() expects the resv lock to be held.

> +	 */
> +	drm_WARN_ON(obj->dev, obj->import_attach);

Should this be moved to drm_gem_shmem_[un]pin_locked() instead of being
duplicated in all overloads of ->[un]pin()?
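
To illustrate the idea, here is a rough user-space sketch, not kernel
code. Every mock_* name is hypothetical, the 'held' flag stands in for
the dma_resv lock, and the 'warnings' counter stands in for
drm_WARN_ON(). The point is only that the shared _locked helper carries
both the "lock must be held" assertion and the import check, so the
driver ->pin() overloads don't need their own copy:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; these are not kernel types. */
struct mock_resv {
	bool held;		/* models the object's reservation lock */
};

struct mock_obj {
	struct mock_resv resv;
	bool imported;		/* models obj->import_attach != NULL */
	bool is_heap;
	int pin_count;
	int warnings;		/* counts would-be drm_WARN_ON() hits */
};

/* Shared helper: the caller must already hold the reservation lock. */
static int mock_shmem_pin_locked(struct mock_obj *obj)
{
	assert(obj->resv.held);	/* plays the role of dma_resv_assert_held() */

	if (obj->imported)
		obj->warnings++;	/* the check lives here, once */

	obj->pin_count++;
	return 0;
}

/* Driver callback: only driver-specific checks, no locking, no WARN. */
static int mock_driver_pin(struct mock_obj *obj)
{
	if (obj->is_heap)
		return -EINVAL;

	return mock_shmem_pin_locked(obj);
}

/* Core entry point: takes the lock, then calls into the driver. */
static int mock_gem_pin(struct mock_obj *obj,
			int (*pin)(struct mock_obj *))
{
	int ret;

	/* Taking the lock a second time is the recursion from the report. */
	assert(!obj->resv.held);
	obj->resv.held = true;
	ret = pin(obj);
	obj->resv.held = false;

	return ret;
}
```

With that split, a driver callback that tried to take the lock again
would trip the assertion in mock_gem_pin() right away instead of
deadlocking.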

> +	return drm_gem_shmem_object_pin(obj);
>  }
>
> static int lima_gem_vmap(struct drm_gem_object *obj, struct iosys_map *map)
> diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.c b/drivers/gpu/drm/panfrost/panfrost_gem.c
> index d47b40b82b0b..e3fbcb020617 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_gem.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_gem.c
> @@ -192,7 +192,13 @@ static int panfrost_gem_pin(struct drm_gem_object *obj)
>  	if (bo->is_heap)
>  		return -EINVAL;
>
> -	return drm_gem_shmem_pin(&bo->base);
> +	/*
> +	 * Pinning can only happen in response to a prime attachment request
> +	 * from another driver, but dma reservation locking is already being
> +	 * handled by drm_gem_pin
> +	 */
> +	drm_WARN_ON(obj->dev, obj->import_attach);
> +	return drm_gem_shmem_object_pin(obj);
>  }
>
> static enum drm_gem_object_status panfrost_gem_status(struct drm_gem_object *obj)