Re: [PATCH v1 2/5] media: rkvdec: Add an ops to check for decode errors

From: Nicolas Dufresne
Date: Tue Jun 14 2022 - 12:15:50 EST


Le mardi 14 juin 2022 à 16:44 +0200, Hans Verkuil a écrit :
> On 6/10/22 14:52, Nicolas Dufresne wrote:
> > This optional internal ops allow each codec to do their own
> > error status checking. The presence of an error is reported
> > using the ERROR buffer state. This patch have no functional
> > changes.
>
> If a buffer is returned with state ERROR, then that means that it is
> seriously corrupt and userspace is expected to drop it. You might still
> want to show it for debugging purposes, but the normal action is to drop it.

The discussion should be around the ERROR flag, and not the error state. Error
state is just an internal thing that have no meaning API wise, but turns out to
be the only way to get the ERROR flag to be set. With that in mind, this is not
what V4L2_BUF_FLAG_ERROR specification says:

> When this flag is set, the buffer has been dequeued successfully, although
> the data might have been corrupted. This is recoverable, streaming may
> continue as normal and the buffer may be reused normally. Drivers set 
> this flag when the VIDIOC_DQBUF ioctl is called.

For me "seriously corrupt" and "might have been corrupted" is very different.

>
> So this is not a valid approach for a decoder that can still produce a
> decent picture, albeit with macroblock artifacts.
>
> A separate control that can be returned as part of the request and contains
> some sort of error indication would be more appropriate.
>
> Buffer state ERROR is really meant for e.g. DMA errors and it shouldn't
> be mixed with decode errors that still produce a valid picture.

The ERROR flag has been used for many years by the UVC driver to indicate a
partially received image (possibly due to DMA error). That driver went even
further and set the bytesused to the amount of bytes that was received. How this
have been interpreted (mostly due to how the spec around ERROR flag is written)
in GStreamer is that the buffer contains "some valid" data unless payload size
is 0.

As explained earlier, the decision to display "some valid" data is not something
we should decided for our users. This should be left to the application to
decide. Same goes for GStreamer, if a buffer exist but has "some valid data", we
have a GST_BUFFER_FLAG_CORRUPTED flag for it. It is then up for the application
to drop if needed for the application. I'm pretty sure some stateful decoders
also behaves like this (simply because an error is an error, regardless of the
nature of it).

It might be different today, but few years ago, dropping or not dropping was the
main difference between Apple Facetime (dropping) and the other video streaming
applications. One would freeze, the other would show "some valid data".

If you look at the outcome of a partially corrupted decoded images and the
outcome of a mid-frame DMA error (typically from a camera stream), you'll find
that these are visually the same. So it is unfair to consider these two error so
different that a new mechanism must be added. In my opinion, adding RO controls
to signal these corruption only make sense if the hardware can provide detailed
reports of what is corrupted (list/range of macro-blocks, or CTU that are
affected). Then you could measure the level of corruption, but in reality, I
doubt there would be a vast usage of this, specially that the report will likely
be inconsistent due to limited HW support.

Finally, in the bitstream decoder world, including all software decoders I've
worked with, the decode is a success only if all bits are perfectly decoded.
This is the baseline for good vs bad. Userland expected an image, and whatever
happened, in real-time scenario, it must send an image. Sending a corrupted
image, or sending the previously good image remains a decision to be made by
application. As application exist around the implemented mechanism here, I'd
prefer to go for that rather then adding a new API.

Nicolas

>
> Regards,
>
> Hans
>
> >
> > Signed-off-by: Nicolas Dufresne <nicolas.dufresne@xxxxxxxxxxxxx>
> > ---
> > drivers/staging/media/rkvdec/rkvdec.c | 10 ++++++----
> > drivers/staging/media/rkvdec/rkvdec.h | 2 ++
> > 2 files changed, 8 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/staging/media/rkvdec/rkvdec.c b/drivers/staging/media/rkvdec/rkvdec.c
> > index 7bab7586918c..7e76f8b72885 100644
> > --- a/drivers/staging/media/rkvdec/rkvdec.c
> > +++ b/drivers/staging/media/rkvdec/rkvdec.c
> > @@ -950,6 +950,7 @@ static void rkvdec_v4l2_cleanup(struct rkvdec_dev *rkvdec)
> > static irqreturn_t rkvdec_irq_handler(int irq, void *priv)
> > {
> > struct rkvdec_dev *rkvdec = priv;
> > + struct rkvdec_ctx *ctx;
> > enum vb2_buffer_state state;
> > u32 status;
> >
> > @@ -958,12 +959,13 @@ static irqreturn_t rkvdec_irq_handler(int irq, void *priv)
> > VB2_BUF_STATE_DONE : VB2_BUF_STATE_ERROR;
> >
> > writel(0, rkvdec->regs + RKVDEC_REG_INTERRUPT);
> > - if (cancel_delayed_work(&rkvdec->watchdog_work)) {
> > - struct rkvdec_ctx *ctx;
> > + ctx = v4l2_m2m_get_curr_priv(rkvdec->m2m_dev);
> >
> > - ctx = v4l2_m2m_get_curr_priv(rkvdec->m2m_dev);
> > + if (ctx->coded_fmt_desc->ops->check_error_info)
> > + state = ctx->coded_fmt_desc->ops->check_error_info(ctx);
> > +
> > + if (cancel_delayed_work(&rkvdec->watchdog_work))
> > rkvdec_job_finish(ctx, state);
> > - }
> >
> > return IRQ_HANDLED;
> > }
> > diff --git a/drivers/staging/media/rkvdec/rkvdec.h b/drivers/staging/media/rkvdec/rkvdec.h
> > index 633335ebb9c4..4ae8e6c6b03c 100644
> > --- a/drivers/staging/media/rkvdec/rkvdec.h
> > +++ b/drivers/staging/media/rkvdec/rkvdec.h
> > @@ -73,6 +73,8 @@ struct rkvdec_coded_fmt_ops {
> > struct vb2_v4l2_buffer *dst_buf,
> > enum vb2_buffer_state result);
> > int (*try_ctrl)(struct rkvdec_ctx *ctx, struct v4l2_ctrl *ctrl);
> > + /* called from IRQ handler */
> > + int (*check_error_info)(struct rkvdec_ctx *ctx);
> > };
> >
> > struct rkvdec_coded_fmt_desc {
>