Re: [PATCH v8 09/13] [media] vb2: add in-fence support to QBUF

From: Ezequiel Garcia
Date: Wed Apr 25 2018 - 16:11:18 EST


On 14 March 2018 at 12:55, Hans Verkuil <hverkuil@xxxxxxxxx> wrote:
> On 03/09/2018 09:49 AM, Gustavo Padovan wrote:
>> From: Gustavo Padovan <gustavo.padovan@xxxxxxxxxxxxx>
>>
>> Receive in-fence from userspace and add support for waiting on them
>> before queueing the buffer to the driver. Buffers can't be queued to the
>> driver before its fences signal. And a buffer can't be queue to the driver
>
> queue -> queued
>
>> out of the order they were queued from userspace. That means that even if
>> it fence signal it must wait all other buffers, ahead of it in the queue,
>
> it fence signal -> its fence signals
> wait all -> wait for all
>
>> to signal first.
>>
>> If the fence for some buffer fails we do not queue it to the driver,
>> instead we mark it as error and wait until the previous buffer is done
>> to notify userspace of the error. We wait here to deliver the buffers back
>> to userspace in order.
>>
>> v9: - rename fence to in_fence in many places
>> - handle fences signalling with error better (Hans Verkuil)
>>
>> v8: - improve comments and docs (Hans Verkuil)
>> - fix unlocking of vb->fence_cb_lock on vb2_core_qbuf (Hans Verkuil)
>> - move in-fences code that was in the out-fences patch here (Alex)
>>
>> v8: - improve comments about fences with errors
>
> v9? Two v8 entries?
>
>>
>> v7:
>> - get rid of the fence array stuff for ordering and just use
>> get_num_buffers_ready() (Hans)
>> - fix issue of queuing the buffer twice (Hans)
>> - avoid the dma_fence_wait() in core_qbuf() (Alex)
>> - merge preparation commit in
>>
>> v6:
>> - With fences always keep the order userspace queues the buffers.
>> - Protect in_fence manipulation with a lock (Brian Starkey)
>> - check if fences have the same context before adding a fence array
>> - Fix last_fence ref unbalance in __set_in_fence() (Brian Starkey)
>> - Clean up fence if __set_in_fence() fails (Brian Starkey)
>> - treat -EINVAL from dma_fence_add_callback() (Brian Starkey)
>>
>> v5: - use fence_array to keep buffers ordered in vb2 core when
>> needed (Brian Starkey)
>> - keep backward compat on the reserved2 field (Brian Starkey)
>> - protect fence callback removal with lock (Brian Starkey)
>>
>> v4:
>> - Add a comment about dma_fence_add_callback() not returning a
>> error (Hans)
>> - Call dma_fence_put(vb->in_fence) if fence signaled (Hans)
>> - select SYNC_FILE under config VIDEOBUF2_CORE (Hans)
>> - Move dma_fence_is_signaled() check to __enqueue_in_driver() (Hans)
>> - Remove list_for_each_entry() in __vb2_core_qbuf() (Hans)
>> - Remove if (vb->state != VB2_BUF_STATE_QUEUED) from
>> vb2_start_streaming() (Hans)
>> - set IN_FENCE flags on __fill_v4l2_buffer (Hans)
>> - Queue buffers to the driver as soon as they are ready (Hans)
>> - call fill_user_buffer() after queuing the buffer (Hans)
>> - add err: label to clean up fence
>> - add dma_fence_wait() before calling vb2_start_streaming()
>>
>> v3: - document fence parameter
>> - remove ternary if at vb2_qbuf() return (Mauro)
>> - do not change if conditions behaviour (Mauro)
>>
>> v2:
>> - fix vb2_queue_or_prepare_buf() ret check
>> - remove check for VB2_MEMORY_DMABUF only (Javier)
>> - check num of ready buffers to start streaming
>> - when queueing, start from the first ready buffer
>> - handle queue cancel
>>
>> Signed-off-by: Gustavo Padovan <gustavo.padovan@xxxxxxxxxxxxx>
>> ---
>> drivers/media/common/videobuf2/videobuf2-core.c | 197 ++++++++++++++++++++----
>> drivers/media/common/videobuf2/videobuf2-v4l2.c | 34 +++-
>> drivers/media/v4l2-core/Kconfig | 33 ++++
>> include/media/videobuf2-core.h | 14 +-
>> 4 files changed, 248 insertions(+), 30 deletions(-)
>>
>> diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
>> index d3f7bb33a54d..5de5e35cfc40 100644
>> --- a/drivers/media/common/videobuf2/videobuf2-core.c
>> +++ b/drivers/media/common/videobuf2/videobuf2-core.c
>> @@ -352,6 +352,7 @@ static int __vb2_queue_alloc(struct vb2_queue *q, enum vb2_memory memory,
>> vb->index = q->num_buffers + buffer;
>> vb->type = q->type;
>> vb->memory = memory;
>> + spin_lock_init(&vb->fence_cb_lock);
>> for (plane = 0; plane < num_planes; ++plane) {
>> vb->planes[plane].length = plane_sizes[plane];
>> vb->planes[plane].min_length = plane_sizes[plane];
>> @@ -891,20 +892,12 @@ void *vb2_plane_cookie(struct vb2_buffer *vb, unsigned int plane_no)
>> }
>> EXPORT_SYMBOL_GPL(vb2_plane_cookie);
>>
>> -void vb2_buffer_done(struct vb2_buffer *vb, enum vb2_buffer_state state)
>> +static void vb2_process_buffer_done(struct vb2_buffer *vb, enum vb2_buffer_state state)
>> {
>> struct vb2_queue *q = vb->vb2_queue;
>> unsigned long flags;
>> unsigned int plane;
>>
>> - if (WARN_ON(vb->state != VB2_BUF_STATE_ACTIVE))
>> - return;
>> -
>> - if (WARN_ON(state != VB2_BUF_STATE_DONE &&
>> - state != VB2_BUF_STATE_ERROR &&
>> - state != VB2_BUF_STATE_QUEUED &&
>> - state != VB2_BUF_STATE_REQUEUEING))
>> - state = VB2_BUF_STATE_ERROR;
>>
>> #ifdef CONFIG_VIDEO_ADV_DEBUG
>> /*
>> @@ -921,6 +914,9 @@ void vb2_buffer_done(struct vb2_buffer *vb, enum vb2_buffer_state state)
>> call_void_memop(vb, finish, vb->planes[plane].mem_priv);
>>
>> spin_lock_irqsave(&q->done_lock, flags);
>> + if (vb->state == VB2_BUF_STATE_ACTIVE)
>> + atomic_dec(&q->owned_by_drv_count);
>> +
>> if (state == VB2_BUF_STATE_QUEUED ||
>> state == VB2_BUF_STATE_REQUEUEING) {
>> vb->state = VB2_BUF_STATE_QUEUED;
>> @@ -929,7 +925,7 @@ void vb2_buffer_done(struct vb2_buffer *vb, enum vb2_buffer_state state)
>> list_add_tail(&vb->done_entry, &q->done_list);
>> vb->state = state;
>> }
>> - atomic_dec(&q->owned_by_drv_count);
>> +
>> spin_unlock_irqrestore(&q->done_lock, flags);
>>
>> trace_vb2_buf_done(q, vb);
>> @@ -946,6 +942,36 @@ void vb2_buffer_done(struct vb2_buffer *vb, enum vb2_buffer_state state)
>> wake_up(&q->done_wq);
>> break;
>> }
>> +
>> +}
>> +
>> +void vb2_buffer_done(struct vb2_buffer *vb, enum vb2_buffer_state state)
>> +{
>> + if (WARN_ON(vb->state != VB2_BUF_STATE_ACTIVE))
>> + return;
>> +
>> + if (WARN_ON(state != VB2_BUF_STATE_DONE &&
>> + state != VB2_BUF_STATE_ERROR &&
>> + state != VB2_BUF_STATE_QUEUED &&
>> + state != VB2_BUF_STATE_REQUEUEING))
>> + state = VB2_BUF_STATE_ERROR;
>> +
>> + vb2_process_buffer_done(vb, state);
>> +
>> + /*
>> + * Check if there is any buffer with error in the next position of the queue,
>> + * buffers whose in-fence signaled with error are not queued to the driver
>> + * and kept on the queue until the buffer before them is done, so to not
>> + * delivery buffers back to userspace in the wrong order. Here we process
>
> delivery -> deliver
>
>> + * any existing buffers with errors and wake up userspace.
>> + */
>> + for (;;) {
>> + vb = list_next_entry(vb, queued_entry);
>> + if (!vb || vb->state != VB2_BUF_STATE_ERROR)
>> + break;
>> +
>> + vb2_process_buffer_done(vb, VB2_BUF_STATE_ERROR);
>> + }
>> }
>> EXPORT_SYMBOL_GPL(vb2_buffer_done);
>>
>> @@ -1230,6 +1256,9 @@ static void __enqueue_in_driver(struct vb2_buffer *vb)
>> {
>> struct vb2_queue *q = vb->vb2_queue;
>>
>> + if (vb->in_fence && !dma_fence_is_signaled(vb->in_fence))
>> + return;
>> +
>> vb->state = VB2_BUF_STATE_ACTIVE;
>> atomic_inc(&q->owned_by_drv_count);
>>
>> @@ -1281,6 +1310,24 @@ static int __buf_prepare(struct vb2_buffer *vb, const void *pb)
>> return 0;
>> }
>>
>> +static int __get_num_ready_buffers(struct vb2_queue *q)
>> +{
>> + struct vb2_buffer *vb;
>> + int ready_count = 0;
>> + unsigned long flags;
>> +
>> + /* count num of buffers ready in front of the queued_list */
>> + list_for_each_entry(vb, &q->queued_list, queued_entry) {
>> + spin_lock_irqsave(&vb->fence_cb_lock, flags);
>> + if (vb->in_fence && !dma_fence_is_signaled(vb->in_fence))
>> + break;
>> + ready_count++;
>> + spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
>> + }
>> +
>> + return ready_count;
>> +}
>> +
>> int vb2_core_prepare_buf(struct vb2_queue *q, unsigned int index, void *pb)
>> {
>> struct vb2_buffer *vb;
>> @@ -1369,9 +1416,43 @@ static int vb2_start_streaming(struct vb2_queue *q)
>> return ret;
>> }
>>
>> -int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb)
>> +static void vb2_qbuf_fence_cb(struct dma_fence *f, struct dma_fence_cb *cb)
>> +{
>> + struct vb2_buffer *vb = container_of(cb, struct vb2_buffer, fence_cb);
>> + struct vb2_queue *q = vb->vb2_queue;
>> + unsigned long flags;
>> +
>> + spin_lock_irqsave(&vb->fence_cb_lock, flags);
>> + /*
>> + * If the fence signals with an error we mark the buffer as such
>> + * and avoid using it by setting it to VB2_BUF_STATE_ERROR and
>> + * not queueing it to the driver. However we can't notify the error
>> + * to userspace right now because, at the time this callback run, QBUF
>> + * returned already.
>
> returned -> has returned
>
>> + * So we delay that to DQBUF time. See comments in vb2_buffer_done()
>> + * as well.
>> + */
>> + if (vb->in_fence->error)
>> + vb->state = VB2_BUF_STATE_ERROR;
>> +
>> + dma_fence_put(vb->in_fence);
>> + vb->in_fence = NULL;
>> +
>> + if (vb->state == VB2_BUF_STATE_ERROR) {
>> + spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
>> + return;
>> + }
>> +
>> + if (q->start_streaming_called)
>> + __enqueue_in_driver(vb);
>> + spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
>> +}
>> +
>> +int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb,
>> + struct dma_fence *in_fence)
>> {
>> struct vb2_buffer *vb;
>> + unsigned long flags;
>> int ret;
>>
>> vb = q->bufs[index];
>> @@ -1380,16 +1461,18 @@ int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb)
>> case VB2_BUF_STATE_DEQUEUED:
>> ret = __buf_prepare(vb, pb);
>> if (ret)
>> - return ret;
>> + goto err;
>> break;
>> case VB2_BUF_STATE_PREPARED:
>> break;
>> case VB2_BUF_STATE_PREPARING:
>> dprintk(1, "buffer still being prepared\n");
>> - return -EINVAL;
>> + ret = -EINVAL;
>> + goto err;
>> default:
>> dprintk(1, "invalid buffer state %d\n", vb->state);
>> - return -EINVAL;
>> + ret = -EINVAL;
>> + goto err;
>> }
>>
>> /*
>> @@ -1400,6 +1483,7 @@ int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb)
>> q->queued_count++;
>> q->waiting_for_buffers = false;
>> vb->state = VB2_BUF_STATE_QUEUED;
>> + vb->in_fence = in_fence;
>>
>> if (pb)
>> call_void_bufop(q, copy_timestamp, vb, pb);
>> @@ -1407,15 +1491,40 @@ int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb)
>> trace_vb2_qbuf(q, vb);
>>
>> /*
>> - * If already streaming, give the buffer to driver for processing.
>> - * If not, the buffer will be given to driver on next streamon.
>> + * For explicit synchronization: If the fence didn't signal
>> + * yet we setup a callback to queue the buffer once the fence
>> + * signals, and then, return successfully. But if the fence
>
> , and then, -> and then (no commas)
>
>> + * already signaled we lose the reference we held and queue the
>> + * buffer to the driver.
>> */
>> - if (q->start_streaming_called)
>> - __enqueue_in_driver(vb);
>> + spin_lock_irqsave(&vb->fence_cb_lock, flags);
>> + if (vb->in_fence) {
>> + ret = dma_fence_add_callback(vb->in_fence, &vb->fence_cb,
>> + vb2_qbuf_fence_cb);
>> + /* is the fence signaled? */
>> + if (ret == -ENOENT) {
>> + dma_fence_put(vb->in_fence);
>> + vb->in_fence = NULL;
>> + } else if (ret) {
>> + goto unlock;
>> + }
>> + }
>>
>> - /* Fill buffer information for the userspace */
>> - if (pb)
>> - call_void_bufop(q, fill_user_buffer, vb, pb);
>> + /*
>> + * If already streaming and there is no fence to wait on
>> + * give the buffer to driver for processing.
>> + */
>> + if (q->start_streaming_called) {
>> + struct vb2_buffer *b;
>> +
>> + list_for_each_entry(b, &q->queued_list, queued_entry) {
>> + if (b->state != VB2_BUF_STATE_QUEUED)
>> + continue;
>> + if (b->in_fence)
>> + break;
>> + __enqueue_in_driver(b);
>> + }
>> + }
>>
>> /*
>> * If streamon has been called, and we haven't yet called
>> @@ -1424,14 +1533,36 @@ int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb)
>> * then we can finally call start_streaming().
>> */
>> if (q->streaming && !q->start_streaming_called &&
>> - q->queued_count >= q->min_buffers_needed) {
>> + __get_num_ready_buffers(q) >= q->min_buffers_needed) {
>> ret = vb2_start_streaming(q);
>> if (ret)
>> - return ret;
>> + goto unlock;
>> }
>>
>> + spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
>> +
>> + /* Fill buffer information for the userspace */
>> + if (pb)
>> + call_void_bufop(q, fill_user_buffer, vb, pb);
>> +
>> dprintk(2, "qbuf of buffer %d succeeded\n", vb->index);
>> return 0;
>> +
>> +unlock:
>> + spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
>> +
>> +err:
>> + /* Fill buffer information for the userspace */
>> + if (pb)
>> + call_void_bufop(q, fill_user_buffer, vb, pb);
>> +
>> + if (vb->in_fence) {
>> + dma_fence_put(vb->in_fence);
>> + vb->in_fence = NULL;
>> + }
>> +
>> + return ret;
>> +
>> }
>> EXPORT_SYMBOL_GPL(vb2_core_qbuf);
>>
>> @@ -1642,6 +1773,8 @@ EXPORT_SYMBOL_GPL(vb2_core_dqbuf);
>> static void __vb2_queue_cancel(struct vb2_queue *q)
>> {
>> unsigned int i;
>> + struct vb2_buffer *vb;
>> + unsigned long flags;
>>
>> /*
>> * Tell driver to stop all transactions and release all queued
>> @@ -1672,6 +1805,16 @@ static void __vb2_queue_cancel(struct vb2_queue *q)
>> q->queued_count = 0;
>> q->error = 0;
>>
>> + list_for_each_entry(vb, &q->queued_list, queued_entry) {
>> + spin_lock_irqsave(&vb->fence_cb_lock, flags);
>> + if (vb->in_fence) {
>> + dma_fence_remove_callback(vb->in_fence, &vb->fence_cb);
>> + dma_fence_put(vb->in_fence);
>> + vb->in_fence = NULL;
>> + }
>> + spin_unlock_irqrestore(&vb->fence_cb_lock, flags);
>> + }
>> +
>> /*
>> * Remove all buffers from videobuf's list...
>> */
>> @@ -1742,7 +1885,7 @@ int vb2_core_streamon(struct vb2_queue *q, unsigned int type)
>> * Tell driver to start streaming provided sufficient buffers
>> * are available.
>> */
>> - if (q->queued_count >= q->min_buffers_needed) {
>> + if (__get_num_ready_buffers(q) >= q->min_buffers_needed) {
>> ret = v4l_vb2q_enable_media_source(q);
>> if (ret)
>> return ret;
>> @@ -2264,7 +2407,7 @@ static int __vb2_init_fileio(struct vb2_queue *q, int read)
>> * Queue all buffers.
>> */
>> for (i = 0; i < q->num_buffers; i++) {
>> - ret = vb2_core_qbuf(q, i, NULL);
>> + ret = vb2_core_qbuf(q, i, NULL, NULL);
>> if (ret)
>> goto err_reqbufs;
>> fileio->bufs[i].queued = 1;
>> @@ -2443,7 +2586,7 @@ static size_t __vb2_perform_fileio(struct vb2_queue *q, char __user *data, size_
>>
>> if (copy_timestamp)
>> b->timestamp = ktime_get_ns();
>> - ret = vb2_core_qbuf(q, index, NULL);
>> + ret = vb2_core_qbuf(q, index, NULL, NULL);
>> dprintk(5, "vb2_dbuf result: %d\n", ret);
>> if (ret)
>> return ret;
>> @@ -2546,7 +2689,7 @@ static int vb2_thread(void *data)
>> if (copy_timestamp)
>> vb->timestamp = ktime_get_ns();
>> if (!threadio->stop)
>> - ret = vb2_core_qbuf(q, vb->index, NULL);
>> + ret = vb2_core_qbuf(q, vb->index, NULL, NULL);
>> call_void_qop(q, wait_prepare, q);
>> if (ret || threadio->stop)
>> break;
>> diff --git a/drivers/media/common/videobuf2/videobuf2-v4l2.c b/drivers/media/common/videobuf2/videobuf2-v4l2.c
>> index ad1e032c3bf5..1df5dd01c0cd 100644
>> --- a/drivers/media/common/videobuf2/videobuf2-v4l2.c
>> +++ b/drivers/media/common/videobuf2/videobuf2-v4l2.c
>> @@ -23,6 +23,7 @@
>> #include <linux/sched.h>
>> #include <linux/freezer.h>
>> #include <linux/kthread.h>
>> +#include <linux/sync_file.h>
>>
>> #include <media/v4l2-dev.h>
>> #include <media/v4l2-fh.h>
>> @@ -178,6 +179,17 @@ static int vb2_queue_or_prepare_buf(struct vb2_queue *q, struct v4l2_buffer *b,
>> return -EINVAL;
>> }
>>
>> + if ((b->fence_fd != 0 && b->fence_fd != -1) &&
>> + !(b->flags & V4L2_BUF_FLAG_IN_FENCE)) {
>> + dprintk(1, "%s: fence_fd set without IN_FENCE flag\n", opname);
>> + return -EINVAL;
>> + }
>> +
>> + if (b->fence_fd < 0 && (b->flags & V4L2_BUF_FLAG_IN_FENCE)) {
>> + dprintk(1, "%s: IN_FENCE flag set but no fence_fd\n", opname);
>> + return -EINVAL;
>> + }
>> +
>> return __verify_planes_array(q->bufs[b->index], b);
>> }
>>
>> @@ -203,9 +215,14 @@ static void __fill_v4l2_buffer(struct vb2_buffer *vb, void *pb)
>> b->timestamp = ns_to_timeval(vb->timestamp);
>> b->timecode = vbuf->timecode;
>> b->sequence = vbuf->sequence;
>> - b->fence_fd = 0;
>> b->reserved = 0;
>>
>> + b->fence_fd = 0;
>> + if (vb->in_fence)
>> + b->flags |= V4L2_BUF_FLAG_IN_FENCE;
>> + else
>> + b->flags &= ~V4L2_BUF_FLAG_IN_FENCE;
>> +
>
> I was wondering: if the in-fence was already triggered this flag is not set because
> vb->in_fence will be NULL. Would it be useful to add a vb2 flag that is set when the
> in-fence triggered and report that here somehow? It might be useful for debugging.
>
> Just wondering.
>

I'm also wondering :-) ... does it make sense to use a flag for this?

#define V4L2_BUF_FLAG_IN_FENCE 0x00200000
#define V4L2_BUF_FLAG_OUT_FENCE 0x00400000
#define V4L2_BUF_FLAG_IN_FENCE_SIGNALED 0x00800000

Aren't we using a very limited resource? Is it worry?

Thanks,
--
Ezequiel GarcÃa, VanguardiaSur
www.vanguardiasur.com.ar