Re: Support for 2D engines/blitters in V4L2 and DRM
From: Nicolas Dufresne
Date: Fri Apr 19 2019 - 15:13:48 EST
On Friday, 19 April 2019 at 13:27 +0900, Tomasz Figa wrote:
> On Fri, Apr 19, 2019 at 9:30 AM Nicolas Dufresne <nicolas@xxxxxxxxxxxx> wrote:
> > On Thursday, 18 April 2019 at 10:18 +0200, Daniel Vetter wrote:
> > > > It would be cool if both could be used concurrently and not just return
> > > > -EBUSY when the device is used with the other subsystem.
> > >
> > > We live in this world already :-) I think there are even patches (or
> > > already merged code) to add fences to v4l, for Android.
> >
> > This work is currently suspended. It will require some features on the
> > DRM display side to be really useful, and there are also a lot of
> > challenges in V4L2. In the GFX space, most use cases are about
> > rendering as soon as possible. In multimedia, though, we have two
> > problems: we need to synchronize frame rendering with the audio,
> > and output buffers may come out of order due to how video CODECs are
> > made.
> >
> > For the first, we'd need a mechanism to schedule a render at a
> > specific time or vblank. We can of course already implement this in
> > software, but with fences, the scheduling would need to be done in the
> > driver. If the fence is signalled early, the driver should hold
> > the frame until the delay has elapsed. If the fence is signalled late,
> > we also need to think of a workflow. As we can't schedule more than one
> > render in DRM at a time, I don't really see yet how to make that work.
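As a rough illustration of the "hold until the target time" behaviour described above, here is a minimal sketch. It assumes vblanks fall on exact multiples of a fixed period, and the names `plan_flip`, `struct flip_plan` and `next_vblank` are hypothetical, not part of DRM or V4L2. The late case only reports that the target was missed; what to do then is exactly the open workflow question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result: when to flip, and whether the target vblank was missed. */
struct flip_plan {
	uint64_t flip_ns;
	bool late;
};

/* Round t_ns up to the next vblank, assuming vblanks at multiples of period_ns. */
static uint64_t next_vblank(uint64_t t_ns, uint64_t period_ns)
{
	return (t_ns + period_ns - 1) / period_ns * period_ns;
}

/*
 * signal_ns: time the fence signalled (buffer content is ready)
 * target_ns: requested presentation time (e.g. derived from the audio clock)
 */
struct flip_plan plan_flip(uint64_t signal_ns, uint64_t target_ns,
			   uint64_t period_ns)
{
	struct flip_plan plan;
	uint64_t earliest;

	assert(period_ns > 0);

	/* An early fence is held until the target; a late one flips ASAP. */
	earliest = signal_ns > target_ns ? signal_ns : target_ns;
	plan.flip_ns = next_vblank(earliest, period_ns);
	plan.late = plan.flip_ns > next_vblank(target_ns, period_ns);
	return plan;
}
```

With a 16 ns period and a 32 ns target, a fence signalled at 20 ns is held until the vblank at 32 ns, while one signalled at 40 ns slips to the vblank at 48 ns and is flagged late.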
> >
> > For the second, it's complicated on the V4L2 side. Currently we signal
> > buffers when they are ready, in display order. With fences, we would
> > receive buffer and fence pairs early (in decoding order). There are
> > cases where reordering is done by the driver (stateful CODECs). We
> > cannot schedule these immediately; we would need a new mechanism to know
> > which one comes next. If we just reuse the current mechanism, it would
> > defeat the purpose of fences, since the fence will always be signalled by
> > the time it reaches DRM or another V4L2 component.
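The decode-order versus display-order mismatch can be sketched with a toy reorder queue. This is only an illustration: it assumes display indices are consecutive starting at 0 and that the reorder depth stays below `REORDER_MAX`; `struct reorder` and `reorder_push` are hypothetical names, not a V4L2 interface.

```c
#include <assert.h>
#include <stddef.h>

#define REORDER_MAX 16

/* Toy reorder queue: tracks which display indices have finished decoding. */
struct reorder {
	unsigned char ready[REORDER_MAX];
	unsigned int next;	/* next display index allowed to go out */
};

/*
 * Mark display_idx as decoded and release every frame that is now
 * contiguous in display order. Returns how many indices were written
 * to out[].
 */
size_t reorder_push(struct reorder *r, unsigned int display_idx,
		    unsigned int *out, size_t out_cap)
{
	size_t n = 0;

	assert(r != NULL && out != NULL);

	r->ready[display_idx % REORDER_MAX] = 1;
	while (n < out_cap && r->ready[r->next % REORDER_MAX]) {
		r->ready[r->next % REORDER_MAX] = 0;
		out[n++] = r->next++;
	}
	return n;
}
```

Pushing frames in decode order 0, 3, 1, 2 releases 0 immediately, nothing for 3, then 1, then 2 and 3 together. A fence handed out when frame 3 is queued for decoding says nothing about it being the fourth frame to display, which is the missing piece of information.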
> >
> > There are also other issues. For a video capture pipeline, if you are
> > not rendering ASAP, you need the HW timestamp in order to schedule. Again,
> > we'd get the fence early, but the actual timestamp will only be available
> > at the very last minute, so we also risk turning the fence into pure
> > overhead. Note that as we speak, I have colleagues who are
> > experimenting with frame timestamp prediction that slaves to the
> > effective timestamp (catching up over time). But we still have issues
> > when the capture driver skips a frame (misses a capture window).
>
> Note that a fence has a timestamp internally, and it can be queried
> from user space if the fence is exposed as a sync file:
> https://elixir.bootlin.com/linux/v5.1-rc5/source/drivers/dma-buf/sync_file.c#L386
Don't we need something the other way around? This seems to be the
timestamp of when the fence was signalled (I'm not familiar with this, though).
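For reference, a minimal user-space sketch of that query, using `SYNC_IOC_FILE_INFO` and the `struct sync_file_info` / `struct sync_fence_info` definitions from `<linux/sync_file.h>`. The helper name `sync_file_signal_time_ns` and the choice of `EAGAIN` for a not-yet-signalled fence are assumptions made for the example. It also shows the limitation: the timestamp is only reported once the fence has signalled, so it cannot be used to schedule ahead of time.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/sync_file.h>

/*
 * Hypothetical helper: store the latest signal timestamp (in ns) of the
 * fences behind sync file fd into *ts_ns. Returns 0 on success, -1 with
 * errno set otherwise (EAGAIN if the sync file has not signalled yet).
 */
int sync_file_signal_time_ns(int fd, uint64_t *ts_ns)
{
	struct sync_file_info info;
	struct sync_fence_info *fences;
	uint64_t latest = 0;
	uint32_t i;
	int err;

	assert(ts_ns != NULL);

	/* First call with num_fences == 0 only reports the fence count. */
	memset(&info, 0, sizeof(info));
	if (ioctl(fd, SYNC_IOC_FILE_INFO, &info) < 0)
		return -1;
	if (info.num_fences == 0) {
		errno = ENOENT;
		return -1;
	}

	fences = calloc(info.num_fences, sizeof(*fences));
	if (!fences)
		return -1;

	/* Second call fills one sync_fence_info per fence. */
	info.sync_fence_info = (uint64_t)(uintptr_t)fences;
	if (ioctl(fd, SYNC_IOC_FILE_INFO, &info) < 0) {
		err = errno;
		free(fences);
		errno = err;
		return -1;
	}

	/* status == 1 means signalled; 0 is still active, < 0 is an error. */
	if (info.status != 1) {
		free(fences);
		errno = EAGAIN;
		return -1;
	}

	for (i = 0; i < info.num_fences; i++)
		if (fences[i].timestamp_ns > latest)
			latest = fences[i].timestamp_ns;

	free(fences);
	*ts_ns = latest;
	return 0;
}
```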
>
> Fences in V4L2 would be also useful for stateless decoders and any
> mem-to-mem processors that operate in order, like the blitters
> mentioned here or actually camera ISPs, which can be often chained
> into relatively sophisticated pipelines.
I agree fences can be used to optimize specific corner cases. They are
not as critical in V4L2 since we have async queues. I think the use
case for fences in V4L2 is mostly to lower latency, and not all use
cases require such low latency. There was an argument that fences
simplify the code, but I haven't seen a compelling demonstration
that this would be the case for V4L2 programming. The
only case is when doing V4L2 to DRM exchanges, and only in contexts
where time synchronization does not matter. In fact, so far it is more
work, since information starts flowing through separate events
(buffer/fence first, then timestamps and possibly critical metadata later).
This might be induced by the design, but clearly there is a slight API
clash.
>
> Best regards,
> Tomasz