Re: [PATCH] [media] vb2: map dmabuf for planes on driver queue instead of vidioc_qbuf

From: Nicolas Dufresne
Date: Mon Jul 18 2016 - 09:30:37 EST


Le lundi 18 juillet 2016 Ã 10:34 +0200, Hans Verkuil a ÃcritÂ:
> On 07/15/2016 06:26 PM, Javier Martinez Canillas wrote:
> > The buffer planes' dma-buf are currently mapped when buffers are queued
> > from userspace but it's more appropriate to do the mapping when buffers
> > are queued in the driver since that's when the actual DMA operation are
> > going to happen.
>
> Does this solve anything? Once the DMA has started the behavior is the same
> as before (QBUF maps the dmabuf), only while the DMA engine hasn't started
> yet are the QBUF calls just accepted and the mapping takes place when the
> DMA is kickstarted. This makes QBUF behave inconsistently.

The expected behaviour would have been to ensure that DMABuf mapping
only happen when the driver need the buffer (as late as possible). As
you describes it, the goal is not met.

>
> You don't describe here WHY this change is needed.

It should have been explained (just like this patch should have been
marked as RFC). The is numerous reason why you don't want to spend
userspace time mapping a buffers.

First the context, mapping a DMA-Buf is a costy operation. With the
venu of DMA-Buf fences, (in implicit mode) this operation becomes even
more expensive in term of time you block userspace. By mapping in QBUF,
you do block the userspace process for a certain duration.

By delaying the mapping later, the time spent mapping is now in the
driver thread, without blocking the userspace. This allow running with
much lower latency, as the userspace can (if already available) fetch
the following buffer and put it in the queue without further delays. As
the buffers are available earlier, the streaming can be started sooner
and no time is lost.

Another reason, which was not part of our discussion, is that if you
have a display at lower framerate, it would allow appropriate frame
skipping to be implemented. Mapping all the frames, even the one that
won't be displayed would be inefficient.

So basically, what we are saying, is that the currently implemented
mechanism is a was of userspace time, reduce the benefit of using
fences and increase latency.
Â
>
> I'm not sure I agree with the TODO, and even if I did, I'm not sure I agree
> with this solution. Since queuing the buffer to the driver is not the same
> as 'just before the DMA', since there may be many buffers queued up in the
> driver and you don't know in vb2 when the buffer is at the 'just before the DMA'
> stage.

Unfortunate, but it's just software ;-P An idea would be to introduce
some new state for preparing a buffers, so the driver don't endup
waiting at an unfortunate moment. Again, this is all only needed if we
can provide the same level of buffer validation we had at QBUF. As we
don't expose to userspace the information needed to validate if a DMA-
Buf is compatible, we started (not yet merged) implementing fallback at
QBuf. Failing asynchronously would leave userspace with absolutely no
way to handle the case of incompatible DMA-Buf.

Regards,
Nicolas

p.s. I'll be away for the rest of the summer, see you in September.

>
> Regards,
>
> Hans
>
> >
> > Suggested-by: Nicolas Dufresne <nicolas.dufresne@xxxxxxxxxxxxx>
> > Signed-off-by: Javier Martinez Canillas <javier@xxxxxxxxxxxxxxx>
> >
> > ---
> >
> > Hello,
> >
> > A side effect of this change is that if the dmabuf map fails for some
> > reasons (i.e: a driver using the DMA contig memory allocator but CMA
> > not being enabled), the fail will no longer happen on VIDIOC_QBUF but
> > later (i.e: in VIDIOC_STREAMON).
> >
> > I don't know if that's an issue though but I think is worth mentioning.
> >
> > Best regards,
> > Javier
> >
> > Âdrivers/media/v4l2-core/videobuf2-core.c | 88 ++++++++++++++++++++------------
> > Â1 file changed, 54 insertions(+), 34 deletions(-)
> >
> > diff --git a/drivers/media/v4l2-core/videobuf2-core.c b/drivers/media/v4l2-core/videobuf2-core.c
> > index ca8ffeb56d72..3fdf882bf279 100644
> > --- a/drivers/media/v4l2-core/videobuf2-core.c
> > +++ b/drivers/media/v4l2-core/videobuf2-core.c
> > @@ -186,7 +186,7 @@ module_param(debug, int, 0644);
> > Â})
> > Â
> > Âstatic void __vb2_queue_cancel(struct vb2_queue *q);
> > -static void __enqueue_in_driver(struct vb2_buffer *vb);
> > +static int __enqueue_in_driver(struct vb2_buffer *vb);
> > Â
> > Â/**
> > Â * __vb2_buf_mem_alloc() - allocate video memory for the given buffer
> > @@ -1271,20 +1271,6 @@ static int __qbuf_dmabuf(struct vb2_buffer *vb, const void *pb)
> > Â vb->planes[plane].mem_priv = mem_priv;
> > Â }
> > Â
> > - /* TODO: This pins the buffer(s) withÂÂdma_buf_map_attachment()).. but
> > - Â* really we want to do this just before the DMA, not while queueing
> > - Â* the buffer(s)..
> > - Â*/
> > - for (plane = 0; plane < vb->num_planes; ++plane) {
> > - ret = call_memop(vb, map_dmabuf, vb->planes[plane].mem_priv);
> > - if (ret) {
> > - dprintk(1, "failed to map dmabuf for plane %d\n",
> > - plane);
> > - goto err;
> > - }
> > - vb->planes[plane].dbuf_mapped = 1;
> > - }
> > -
> > Â /*
> > Â Â* Now that everything is in order, copy relevant information
> > Â Â* provided by userspace.
> > @@ -1296,51 +1282,79 @@ static int __qbuf_dmabuf(struct vb2_buffer *vb, const void *pb)
> > Â vb->planes[plane].data_offset = planes[plane].data_offset;
> > Â }
> > Â
> > - if (reacquired) {
> > - /*
> > - Â* Call driver-specific initialization on the newly acquired buffer,
> > - Â* if provided.
> > - Â*/
> > - ret = call_vb_qop(vb, buf_init, vb);
> > + return 0;
> > +err:
> > + /* In case of errors, release planes that were already acquired */
> > + __vb2_buf_dmabuf_put(vb);
> > +
> > + return ret;
> > +}
> > +
> > +/**
> > + * __buf_map_dmabuf() - map dmabuf for buffer planes
> > + */
> > +static int __buf_map_dmabuf(struct vb2_buffer *vb)
> > +{
> > + int ret;
> > + unsigned int plane;
> > +
> > + for (plane = 0; plane < vb->num_planes; ++plane) {
> > + ret = call_memop(vb, map_dmabuf, vb->planes[plane].mem_priv);
> > Â if (ret) {
> > - dprintk(1, "buffer initialization failed\n");
> > - goto err;
> > + dprintk(1, "failed to map dmabuf for plane %d\n",
> > + plane);
> > + return ret;
> > Â }
> > + vb->planes[plane].dbuf_mapped = 1;
> > + }
> > +
> > + /*
> > + Â* Call driver-specific initialization on the newly
> > + Â* acquired buffer, if provided.
> > + Â*/
> > + ret = call_vb_qop(vb, buf_init, vb);
> > + if (ret) {
> > + dprintk(1, "buffer initialization failed\n");
> > + return ret;
> > Â }
> > Â
> > Â ret = call_vb_qop(vb, buf_prepare, vb);
> > Â if (ret) {
> > Â dprintk(1, "buffer preparation failed\n");
> > Â call_void_vb_qop(vb, buf_cleanup, vb);
> > - goto err;
> > + return ret;
> > Â }
> > Â
> > Â return 0;
> > -err:
> > - /* In case of errors, release planes that were already acquired */
> > - __vb2_buf_dmabuf_put(vb);
> > -
> > - return ret;
> > Â}
> > Â
> > Â/**
> > Â * __enqueue_in_driver() - enqueue a vb2_buffer in driver for processing
> > Â */
> > -static void __enqueue_in_driver(struct vb2_buffer *vb)
> > +static int __enqueue_in_driver(struct vb2_buffer *vb)
> > Â{
> > Â struct vb2_queue *q = vb->vb2_queue;
> > Â unsigned int plane;
> > + int ret;
> > Â
> > Â vb->state = VB2_BUF_STATE_ACTIVE;
> > Â atomic_inc(&q->owned_by_drv_count);
> > Â
> > Â trace_vb2_buf_queue(q, vb);
> > Â
> > + if (q->memory == VB2_MEMORY_DMABUF) {
> > + ret = __buf_map_dmabuf(vb);
> > + if (ret)
> > + return ret;
> > + }
> > +
> > Â /* sync buffers */
> > Â for (plane = 0; plane < vb->num_planes; ++plane)
> > Â call_void_memop(vb, prepare, vb->planes[plane].mem_priv);
> > Â
> > Â call_void_vb_qop(vb, buf_queue, vb);
> > +
> > + return 0;
> > Â}
> > Â
> > Âstatic int __buf_prepare(struct vb2_buffer *vb, const void *pb)
> > @@ -1438,8 +1452,11 @@ static int vb2_start_streaming(struct vb2_queue *q)
> > Â Â* If any buffers were queued before streamon,
> > Â Â* we can now pass them to driver for processing.
> > Â Â*/
> > - list_for_each_entry(vb, &q->queued_list, queued_entry)
> > - __enqueue_in_driver(vb);
> > + list_for_each_entry(vb, &q->queued_list, queued_entry) {
> > + ret = __enqueue_in_driver(vb);
> > + if (ret < 0)
> > + return ret;
> > + }
> > Â
> > Â /* Tell the driver to start streaming */
> > Â q->start_streaming_called = 1;
> > @@ -1540,8 +1557,11 @@ int vb2_core_qbuf(struct vb2_queue *q, unsigned int index, void *pb)
> > Â Â* If already streaming, give the buffer to driver for processing.
> > Â Â* If not, the buffer will be given to driver on next streamon.
> > Â Â*/
> > - if (q->start_streaming_called)
> > - __enqueue_in_driver(vb);
> > + if (q->start_streaming_called) {
> > + ret = __enqueue_in_driver(vb);
> > + if (ret)
> > + return ret;
> > + }
> > Â
> > Â /* Fill buffer information for the userspace */
> > Â if (pb)
> >

Attachment: signature.asc
Description: This is a digitally signed message part