Re: [PATCH v2 1/2] media: docs-rst: Document memory-to-memory video decoder interface

From: Tomasz Figa
Date: Wed Feb 06 2019 - 00:43:44 EST


On Wed, Jan 30, 2019 at 1:02 PM Nicolas Dufresne <nicolas@xxxxxxxxxxxx> wrote:
>
> Le vendredi 25 janvier 2019 Ã 12:27 +0900, Tomasz Figa a Ãcrit :
> > On Fri, Jan 25, 2019 at 4:55 AM Nicolas Dufresne <nicolas@xxxxxxxxxxxx> wrote:
> > > Le jeudi 24 janvier 2019 Ã 18:06 +0900, Tomasz Figa a Ãcrit :
> > > > > Actually I just realized the last point might not even be achievable
> > > > > for some of the decoders (s5p-mfc, mtk-vcodec), as they don't report
> > > > > which frame originates from which bitstream buffer and the driver just
> > > > > picks the most recently consumed OUTPUT buffer to copy the timestamp
> > > > > from. (s5p-mfc actually "forgets" to set the timestamp in some cases
> > > > > too...)
> > > > >
> > > > > I need to think a bit more about this.
> > > >
> > > > Actually I misread the code. Both s5p-mfc and mtk-vcodec seem to
> > > > correctly match the buffers.
> > >
> > > Ok good, since otherwise it would have been a regression in MFC driver.
> > > This timestamp passing thing could in theory be made optional though,
> > > it lives under some COPY_TIMESTAMP kind of flag. What that means though
> > > is that a driver without such a capability would need to signal dropped
> > > frames using some other mean.
> > >
> > > In userspace, the main use is to match the produced frame against a
> > > userspace specific list of frames. At least this seems to be the case
> > > in Gst and Chromium, since the userspace list contains a superset of
> > > the metadata found in the v4l2_buffer.
> > >
> > > Now, using the produced timestamp, userspace can deduce frame that the
> > > driver should have produced but didn't (could be a deadline case codec,
> > > or simply the frames where corrupted). It's quite normal for a codec to
> > > just keep parsing until it finally find something it can decode.
> > >
> > > That's at least one way to do it, but there is other possible
> > > mechanism. The sequence number could be used, or even producing buffers
> > > with the ERROR flag set. What matters is just to give userspace a way
> > > to clear these frames, which would simply grow userspace memory usage
> > > over time.
> >
> > Is it just me or we were missing some consistent error handling then?
> >
> > I feel like the drivers should definitely return the bitstream buffers
> > with the ERROR flag, if there is a decode failure of data in the
> > buffer. Still, that could become more complicated if there is more
> > than 1 frame in that piece of bitstream, but only 1 frame is corrupted
> > (or whatever).
>
> I agree, but it might be more difficult then it looks (even FFMPEG does
> not do that). I believe the code that is processing the bitstream in
> stateful codecs is mostly unrelated from the code actually doing the
> decoding. So what might happen is that the decoding part will never
> actually allocate a buffer for the skipped / corrupted part of the
> bitstream. Also, the notion of a skipped frame is not always evident in
> when parsing H264 or HEVC NALs. There is still a full page of text just
> to explain how to detect that start of a new frame.

Right. I don't think we can guarantee that we can always correlate the
errors with exact buffers and so I phrased the paragraph about errors
in v3 in a bit more conservative way:

See the snapshot hosted by Hans (thanks!):
https://hverkuil.home.xs4all.nl/codec-api/uapi/v4l/dev-decoder.html#decoding

>
> Yet, it would be interesting to study the firmwares we have and see
> what they provide that would help making decode errors more explicit.
>

Agreed.

> >
> > Another case is when the bitstream, even if corrupted, is still enough
> > to produce some output. My intuition tells me that such CAPTURE buffer
> > should be then returned with the ERROR flag. That wouldn't still be
> > enough for any more sophisticated userspace error concealment, but
> > could still let the userspace know to perhaps drop the frame.
>
> You mean if a frame was concealed (typically the frame was decoded from
> a closed by reference instead of the expected reference). That is
> something signalled by FFPEG. We should document this possibility. I
> actually have something implemented in GStreamer. Basically if we have
> the ERROR flag with a payload size smaller then expected, I drop the
> frame and produce a drop event message, while if I have a frame with
> ERROR flag but of the right payload size, I assume it is corrupted, and
> simply flag it as corrupted, leaving to the application the decision to
> display it or not. This is a case that used to happen with some UVC
> cameras (though some have been fixed, and the UVC camera should drop
> smaller payload size buffers now).

I think it's a behavior that makes the most sense indeed.

Technically one could also consider the case of 0 < bytesused <
sizeimage, which could mean that only a part of the frame is in the
buffer. An application could try to blend it with previous frame using
some concealing algorithms. I haven't seen an app that could do such
thing, though.

Best regards,
Tomasz