Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation

From: Danilo Krummrich
Date: Mon Sep 11 2023 - 18:43:20 EST


On Mon, Sep 11, 2023 at 04:45:26PM +0200, Boris Brezillon wrote:
> On Sat, 9 Sep 2023 17:31:13 +0200
> Danilo Krummrich <dakr@xxxxxxxxxx> wrote:
>
> > @@ -807,6 +1262,14 @@ drm_gpuvm_bo_destroy(struct kref *kref)
> >
> > drm_gem_gpuva_assert_lock_held(vm_bo->obj);
> >
> > + spin_lock(&gpuvm->extobj.lock);
> > + list_del(&vm_bo->list.entry.extobj);
> > + spin_unlock(&gpuvm->extobj.lock);
> > +
> > + spin_lock(&gpuvm->evict.lock);
> > + list_del(&vm_bo->list.entry.evict);
> > + spin_unlock(&gpuvm->evict.lock);
> > +
> > list_del(&vm_bo->list.entry.gem);
> >
> > drm_gem_object_put(obj);
>
> I ran into a UAF situation when the drm_gpuvm_bo object is the last
> owner of obj, because the lock that's supposed to be held when calling
> this function (drm_gem_gpuva_assert_lock_held() call above), belongs to
> obj (either obj->resv, or a driver specific lock that's attached to the
> driver-specific GEM object). I worked around it by taking a ref to obj
> before calling lock()+drm_gpuvm_bo_put()+unlock(), and releasing it
> after I'm node with the lock, but that just feels wrong.
>
As mentioned in a previous reply, I think we want to bring the dedicated GEM
gpuva list lock back instead of abusing the dma-resv lock. This way we can
handle locking internally and don't run into such issues.

There is also no reason for a driver to already hold the GEM gpuva list lock
when when calling drm_gpuvm_bo_put(). Drivers would only acquire the lock to
iterate the GEMs list of drm_gpuvm_bos or the drm_gpuvm_bos list of drm_gpuvas.
And dropping the drm_gpuvm_bo from within such a loop is forbidden anyways.