Re: [PATCH] RFC: dma-buf: Add an API for importing and exporting sync files

From: Christian KÃnig
Date: Thu Feb 27 2020 - 03:28:41 EST


Am 26.02.20 um 17:46 schrieb Bas Nieuwenhuizen:
On Wed, Feb 26, 2020 at 4:29 PM Jason Ekstrand <jason@xxxxxxxxxxxxxx> wrote:
On Wed, Feb 26, 2020 at 4:05 AM Daniel Vetter <daniel@xxxxxxxx> wrote:
On Wed, Feb 26, 2020 at 10:16:05AM +0100, Christian KÃnig wrote:
[SNIP]
Just imagine that you access some DMA-buf with a shader and that operation
is presented as a fence on the DMA-bufs reservation object. And now you can
go ahead and replace that fence and free up the memory.

Tricking the Linux kernel into allocating page tables in that freed memory
is trivial and that's basically it you can overwrite page tables with your
shader and gain access to all of system memory :)

What we could do is to always make sure that the added fences will complete
later than the already existing ones, but that is also rather tricky to get
right. I wouldn't do that if we don't have a rather big use case for this.
Right. I thought about that but I'm still learning how dma_resv
works. It'd be easy enough to make a fence array that contains both
the old fence and the new fence and replace the old fence with that.
What I don't know is the proper way to replace the exclusive fence
safely. Some sort of atomic_cpxchg loop, perhaps? I presume there's
some way of doing it properly because DRM drivers are doing it all the
time.

First of all you need to grab the lock of the dma_resv object or you can't replace the exclusive nor the shared ones.

This way you don't need to do a atomic_cmpxchg or anything else and still guarantee correct ordering.

I think for an exclusive fence you may need to create a fence array
that includes the existing exclusive and shared fences in the dma_resv
combined with the added fence.

Yes, that at least gives us the correct synchronization.

However, I'm not sure what the best way is to do garbage collection on
that so that we don't get an impossibly list of fence arrays.

Exactly yes. That's also the reason why the dma_fence_chain container I came up with for the sync timeline stuff has such a rather sophisticated garbage collection.

When some of the included fences signal you need to free up the array/chain and make sure that the memory for the container can be reused.

(Note
the dma_resv has a lock that needs to be taken before adding an
exclusive fence, might be useful). Some code that does a thing like
this is __dma_resv_make_exclusive in
drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c

Wanted to move that into dma_resv.c for quite a while since there are quite a few other cases where we need this.

Regards,
Christian.

The other piece of the puzzle is that on the submit path this would
need something to ignore implicit fences. And there semantically the
question comes up whether it is safe for a driver to ignore exclusive
fences from another driver. (and then we have amdgpu which has its own
rules on exclusiveness of its shared fences based on the context. e.g.
the current option to ignore implicit fences for a buffer still syncs
on exclusive fences on the buffer).