Re: [PATCH v3] media: docs-rst: Document m2m stateless video decoder interface

From: Hans Verkuil
Date: Thu Feb 28 2019 - 05:14:23 EST


On 2/26/19 4:33 AM, Alexandre Courbot wrote:
> Hi, sorry for the delayed reply!
>
> On Wed, Feb 13, 2019 at 8:04 PM Paul Kocialkowski
> <paul.kocialkowski@xxxxxxxxxxx> wrote:
>>
>> Hi,
>>
>> On Wed, 2019-02-13 at 10:57 +0100, Hans Verkuil wrote:
>>> On 2/13/19 10:20 AM, Paul Kocialkowski wrote:
>>>> Hi,
>>>>
>>>> On Wed, 2019-02-13 at 09:59 +0100, Hans Verkuil wrote:
>>>>> On 2/13/19 6:58 AM, Alexandre Courbot wrote:
>>>>>> On Wed, Feb 13, 2019 at 2:53 PM Alexandre Courbot <acourbot@xxxxxxxxxxxx> wrote:
>>>>>>> [snip]
>>>>>>> +Buffers used as reference frames can be queued back to the ``CAPTURE`` queue as
>>>>>>> +soon as all the frames they are affecting have been queued to the ``OUTPUT``
>>>>>>> +queue. The driver will refrain from using the reference buffer as a decoding
>>>>>>> +target until all the frames depending on it are decoded.
>>>>>>
>>>>>> Just want to highlight this part in order to make sure that this is
>>>>>> indeed what we agreed on. The recent changes to vb2_find_timestamp()
>>>>>> suggest this, but maybe I misunderstood the intent. It makes the
>>>>>> kernel responsible for tracking referenced buffers and not using them
>>>>>> until all the dependent frames are decoded, something the client could
>>>>>> also do.
>>>>>
>>>>> I don't think this is quite right. Once this patch https://patchwork.linuxtv.org/patch/54275/
>>>>> is in, the vb2 core will track when a buffer can no longer be used as a
>>>>> reference buffer because the underlying memory might have disappeared.
>>>>>
>>>>> The core does not check if it makes sense to use a buffer as a reference
>>>>> frame, just that it is valid memory.
>>>>>
>>>>> So the driver has to check that the timestamp refers to an existing
>>>>> buffer, but userspace has to check that it queues everything in the
>>>>> right order and that the reference buffer won't be overwritten
>>>>> before the last output buffer using that reference buffer has been
>>>>> decoded.
>>>>>
>>>>> So I would say that the second sentence in your paragraph is wrong.
>>>>>
>>>>> The first sentence isn't quite right either, but I am not really sure how
>>>>> to phrase it. It is possible to queue a reference buffer even if
>>>>> not all output buffers referring to it have been decoded, provided
>>>>> that by the time the driver starts to use this buffer this actually
>>>>> has happened.
>>>>
>>>> Is there a way we can guarantee this? Looking at the rest of the spec,
>>>> it says that capture buffers "are returned in decode order" but that
>>>> doesn't imply that they are picked up in the order they are queued.
>>>>
>>>> It seems quite troublesome for the core to check whether each queued
>>>> capture buffer is used as a reference for one of the queued requests to
>>>> decide whether to pick it up or not.
>>>
>>> The core only checks that the timestamp points to a valid buffer.
>>>
>>> It is not up to the core or the driver to do anything else. If userspace
>>> gives a reference to a wrong buffer or one that is already overwritten,
>>> then you just get bad decoded video, but nothing crashes.
>>
>> Yes, that makes sense. My concern was mainly about cases where the
>> capture buffers could be consumed by the driver in a different order
>> than they are queued by userspace (which could lead to the reference
>> buffer being reused too early). But thinking about it twice, I don't
>> see a reason why this could happen.
>
> Do we have a guarantee that it won't happen though? AFAICT the
> behavior that CAPTURE buffers must be processed in queue order is not
> documented anywhere, and not guaranteed by VB2 (even though
> implementation-wise it may currently be the case). So with the current
> state of the specification, the only safe wording I can use is "do not
> queue a reference buffer back until all the frames depending on it
> have been dequeued".
>
> However, as Hans mentioned it would be nice to be able to assume that
> the capture queue is FIFO and let user-space rely on that fact to
> queue buffers containing reference frames earlier.

I would not be opposed to adding a capability that explicitly states that
the given vb2 queue is always ordered. It would always be true for drivers
using the v4l2-mem2mem framework (and can be set there).

Unordered queues make no sense for m2m devices; at least I cannot think
of any use-case for them.
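
To illustrate what an ordered-queue guarantee would change for userspace, here
is a minimal C sketch of the requeue decision discussed above. Everything in it
is hypothetical (the struct, its field names, the policy enum); none of it is
V4L2 API. It also assumes that each queued OUTPUT buffer consumes exactly one
CAPTURE buffer, which may not hold for every driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical policies: what userspace is willing to assume about the queue. */
enum requeue_policy {
	REQUEUE_CONSERVATIVE,	/* no ordering assumption, always safe */
	REQUEUE_ASSUME_FIFO,	/* CAPTURE buffers are consumed in queue order */
};

/* Hypothetical per-reference-buffer bookkeeping kept by the application. */
struct ref_state {
	/* Frames that will reference this buffer but are not on OUTPUT yet. */
	unsigned int dependents_not_queued;
	/*
	 * Queued, not yet decoded OUTPUT buffers up to and including the
	 * last one that references this buffer.
	 */
	unsigned int output_pending;
	/* CAPTURE buffers already queued; a FIFO queue consumes these first. */
	unsigned int capture_queued_ahead;
};

/* Returns true if the reference buffer may be queued back to CAPTURE now. */
static bool may_requeue(const struct ref_state *s, enum requeue_policy policy)
{
	assert(s != NULL);

	/* A future frame still needs this buffer: never safe. */
	if (s->dependents_not_queued > 0)
		return false;

	/* Every dependent frame has been decoded: always safe. */
	if (s->output_pending == 0)
		return true;

	if (policy == REQUEUE_CONSERVATIVE)
		return false;

	/*
	 * FIFO assumption: the buffers queued ahead absorb all pending
	 * decodes, so this one is not picked as a decoding target until
	 * its last dependent frame is done.
	 */
	return s->capture_queued_ahead >= s->output_pending;
}
```

With REQUEUE_CONSERVATIVE this is the rule that is safe on any queue; the
REQUEUE_ASSUME_FIFO branch is only valid if the ordering is actually guaranteed.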

>
>>
>>>>> But this is an optimization and theoretically it can depend on the
>>>>> driver behavior. It is always safe to only queue a reference frame
>>>>> when all frames depending on it have been decoded. So I am leaning
>>>>> towards not complicating matters and keeping your first sentence
>>>>> as-is.
>>>>
>>>> Yes, I believe it would be much simpler to require userspace to only
>>>> queue capture buffers once they are no longer needed as references.
>>>
>>> I think that's what we should document, but in cases where you know
>>> the hardware (i.e. an embedded system) it should be allowed to optimize
>>> and have the application queue a capture buffer containing a reference
>>> frame even if it is still in use by already queued output buffers.
>>>
>>> That way you can achieve optimal speed and memory usage.
>>>
>>> I think this is a desirable feature.
>>
>> Yes, definitely.
>
> I guess the question comes down to "how can user-space know that the
> hardware supports this"? Do we have a flag that we can return to
> signal this behavior? Or can we just define the CAPTURE queue as being
> FIFO for stateless codecs? The latter would make sense IMHO.

As far as I know all m2m devices are ordered today, as are all video
output devices. For video capture devices I know of one driver that is unordered:
cobalt. As far as I know, all other video capture devices that use vb2 or the
old videobuf are ordered. A few very old drivers that do not use these frameworks
would have to be reviewed to see if they do anything weird.

I think the cobalt driver can be modified so that it is ordered as well.

There are three options: add a V4L2_BUF_CAP_IS_ORDERED flag, add a
V4L2_BUF_CAP_IS_UNORDERED flag, or just document and require that all v4l2 devices
are ordered (nobody cares about the cobalt driver, it's Cisco internal hardware).

I'm honestly not certain which of these options is the best, keeping in mind that
we have no idea if reordering might be needed in the future.
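
As a rough sketch of how each of the three options would look from userspace,
assuming the flag (if any) is reported in a capabilities bitmask alongside the
existing V4L2_BUF_CAP_* flags. Neither flag exists today, so the bit values
below are made up, and the enum and function names are hypothetical as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values: neither flag is part of the V4L2 UAPI. */
#define V4L2_BUF_CAP_IS_ORDERED		(1u << 30)
#define V4L2_BUF_CAP_IS_UNORDERED	(1u << 31)

/* The three options, named here only for the sake of the sketch. */
enum ordering_option {
	OPT_ORDERED_FLAG,	/* drivers advertise that the queue is ordered */
	OPT_UNORDERED_FLAG,	/* drivers advertise that the queue is unordered */
	OPT_ALWAYS_ORDERED,	/* the spec requires ordering, no flag at all */
};

/* Returns true if userspace may treat the queue as FIFO under 'opt'. */
static bool queue_is_ordered(enum ordering_option opt, uint32_t capabilities)
{
	switch (opt) {
	case OPT_ORDERED_FLAG:
		/* Flag missing (e.g. older kernel): fall back to unordered. */
		return (capabilities & V4L2_BUF_CAP_IS_ORDERED) != 0;
	case OPT_UNORDERED_FLAG:
		/* Flag missing: the queue is assumed to be ordered. */
		return (capabilities & V4L2_BUF_CAP_IS_UNORDERED) == 0;
	case OPT_ALWAYS_ORDERED:
		return true;
	}
	assert(!"unknown ordering option");
	return false;
}
```

The difference shows up when the flag is absent: with an IS_ORDERED flag an
application falls back to the conservative behavior, while with an IS_UNORDERED
flag it assumes ordering, which is only correct if every driver that does not
set the flag really is ordered.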

Regards,

Hans