Re: [PATCH] media: hantro: make update_dpb() not leave holes in DPB

From: Jonas Karlman
Date: Fri Nov 15 2019 - 01:04:10 EST


On 2019-11-15 06:31, Alexandre Courbot wrote:
> On Fri, Nov 15, 2019 at 1:36 PM Boris Brezillon
> <boris.brezillon@xxxxxxxxxxxxx> wrote:
>> Hi Alexandre,
>>
>> On Fri, 15 Nov 2019 12:50:13 +0900
>> Alexandre Courbot <acourbot@xxxxxxxxxxxx> wrote:
>>
>>> update_dpb() reorders the DPB entries such as the same frame in two
>>> consecutive decoding requests always ends up in the same DPB slot.
>>>
>>> It first clears all the active flags in the DPB, and then checks whether
>>> the active flag is not set to decide whether an entry is a candidate for
>>> matching or not.
>>>
>>> However, this means that unused DPB entries are also candidates for
>>> matching, and an unused entry will match if we are testing it against a
>>> frame which (TopFieldOrderCount, BottomFieldOrderCount) is (0, 0).
>> Hm, I might be wrong but I thought we were supposed to re-use matching
>> entries even if the ref was not active on the last decoded frame. IIRC,
>> a ref can be active on specific decoding request (X), then inactive on
>> the next one (X+1) and active again on the following one (X+2).
>> Shouldn't we re-use the slot we used when decoding X for this ref when
>> decoding X+2?
> I am not sure how often this happens in practice (if at all), but
> maybe this logic would work as well. In this case we would need to
> mark DPB entries that are not used yet differently to avoid the issue
> that this patch attempts to fix.
>
> To give a precise example of the issue, for a stream that only uses 3
> DPB entries at max, after an IDR frame the 4th DPB entry will
> incorrectly be matched with the IDR frame of FieldOrderCount (0, 0)
> and be the only active entry for this frame. Hantro is ok with it, but
> this is not an optimal use of the DPB and MT8183 does not like that.

In my RFC series to fix decoding of field encoded video on hantro [1],
I also include a change to update_dpb that possible also fixes this issue.
I do not fully remember why I made the changes, decoding was fixed
and I have not had time to fully analyze the change to update_dpb yet.

Also note that in order to support field encoded video there are also some
uAPI changes proposed in order to signal how the dpb entry is referenced.
Matching on FieldOrderCount does not work on field encoded video because
first filed may have FieldOrderCount (X,0) and second FieldOrderCount (X,Y).

[1] https://patchwork.linuxtv.org/patch/59688/
[2] https://patchwork.linuxtv.org/patch/59689/

Regards,
Jonas

>
>>> As it turns out, this happens for the very first frame of a stream, but
>>> it is not a problem as it would be copied to the first entry anyway.
>>> More concerning is that after an IDR frame the Top/BottomFieldOrderCount
>>> can be reset to 0, and this time update_dpb() will match the IDR frame
>>> to the first unused entry of the DPB (and not the entry at index 0 as
>>> would be expected) because the first slots will have different values.
>> We could also consider resetting the DPB cache on IDR frames if that
>> works on Hantro.
> Maybe that could be enough indeed. Let me experiment with that a bit.
> I believe this would work since in practice the result would be the
> same as this patch, but for safety I'd rather have unused DPB entries
> be unambiguously identified rather than letting the (0, 0) match do
> the right thing by accident.
>
>>> The Hantro driver is ok with this, but when trying to use the same
>>> function for another driver (MT8183) I noticed decoding artefacts caused
>>> by this hole in the DPB.
>> I guess this new version passes the chromium testsuite on rk-based
>> boards. If that's the case I don't have any objection to this patch.
>>
>> Note that I was not planning to share the DPB caching logic as I
>> thought only Hantro G1 needed that trick. Have you tried passing the
>> DPB directly? Maybe it just works on mtk.
> Passing the DPB directly I get corrupted frames on a regular basis
> with MTK. I also confirmed that the firmware's expectations are what
> this function does. Using the same function in the MTK driver, the
> decoded stream is flawless.
>
>>> Fix this by maintaining a list of DPB slots that are actually in use,
>>> instead of relying on the absence of the active flag to do so. This also
>>> allows us to optimize matching a bit.
>>>
>>> Signed-off-by: Alexandre Courbot <acourbot@xxxxxxxxxxxx>
>>> ---
>>> drivers/staging/media/hantro/hantro_h264.c | 18 ++++++++++++------
>>> 1 file changed, 12 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/drivers/staging/media/hantro/hantro_h264.c b/drivers/staging/media/hantro/hantro_h264.c
>>> index 568640eab3a6..2357068b0f82 100644
>>> --- a/drivers/staging/media/hantro/hantro_h264.c
>>> +++ b/drivers/staging/media/hantro/hantro_h264.c
>>> @@ -474,14 +474,19 @@ static void update_dpb(struct hantro_ctx *ctx)
>>> {
>>> const struct v4l2_ctrl_h264_decode_params *dec_param;
>>> DECLARE_BITMAP(new, ARRAY_SIZE(dec_param->dpb)) = { 0, };
>>> + DECLARE_BITMAP(in_use, ARRAY_SIZE(dec_param->dpb)) = { 0, };
>>> DECLARE_BITMAP(used, ARRAY_SIZE(dec_param->dpb)) = { 0, };
>>> unsigned int i, j;
>>>
>>> dec_param = ctx->h264_dec.ctrls.decode;
>>>
>>> - /* Disable all entries by default. */
>>> - for (i = 0; i < ARRAY_SIZE(ctx->h264_dec.dpb); i++)
>>> + /* Mark entries in use before disabling them. */
>>> + for (i = 0; i < ARRAY_SIZE(ctx->h264_dec.dpb); i++) {
>>> + if (ctx->h264_dec.dpb[i].flags &
>>> + V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
>>> + set_bit(i, in_use);
>>> ctx->h264_dec.dpb[i].flags &= ~V4L2_H264_DPB_ENTRY_FLAG_ACTIVE;
>>> + }
>>>
>>> /* Try to match new DPB entries with existing ones by their POCs. */
>>> for (i = 0; i < ARRAY_SIZE(dec_param->dpb); i++) {
>>> @@ -492,18 +497,19 @@ static void update_dpb(struct hantro_ctx *ctx)
>>>
>>> /*
>>> * To cut off some comparisons, iterate only on target DPB
>>> - * entries which are not used yet.
>>> + * entries which are already used.
>>> */
>>> - for_each_clear_bit(j, used, ARRAY_SIZE(ctx->h264_dec.dpb)) {
>>> + for_each_set_bit(j, in_use, ARRAY_SIZE(ctx->h264_dec.dpb)) {
>>> struct v4l2_h264_dpb_entry *cdpb;
>>>
>>> cdpb = &ctx->h264_dec.dpb[j];
>>> - if (cdpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE ||
>>> - !dpb_entry_match(cdpb, ndpb))
>>> + if (!dpb_entry_match(cdpb, ndpb))
>>> continue;
>>>
>>> *cdpb = *ndpb;
>>> set_bit(j, used);
>>> + /* Don't reiterate on this one. */
>>> + clear_bit(j, in_use);
>>> break;
>>> }
>>>