Re: [PATCH 2/2] media: verisilicon: add WebP decoding support

From: Hugues FRUCHET
Date: Thu Sep 12 2024 - 10:03:15 EST


Thanks Nicolas,


On 9/12/24 15:12, Nicolas Dufresne wrote:
Le jeudi 12 septembre 2024 à 14:18 +0200, Hugues FRUCHET a écrit :
Hi Nicolas,

Thanks for reviewing.

GStreamer changes are provided through this merge request:
https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7505

Code:
https://gitlab.freedesktop.org/gstreamer/gstreamer/-/commit/138ecfac54ce85b273a26ff6f0fefe3998f8d436?merge_request_iid=7505



On 9/11/24 20:44, Nicolas Dufresne wrote:
Le mercredi 11 septembre 2024 à 13:58 -0400, Nicolas Dufresne a écrit :
Hi Hugues,

Le mercredi 11 septembre 2024 à 15:50 +0200, Hugues Fruchet a écrit :
Add WebP picture decoding support to VP8 stateless decoder.

Unless when its obvious, the commit message should explain what is being
changed.


Signed-off-by: Hugues Fruchet <hugues.fruchet@xxxxxxxxxxx>
---
drivers/media/platform/verisilicon/hantro_g1_regs.h | 1 +
drivers/media/platform/verisilicon/hantro_g1_vp8_dec.c | 7 +++++++
2 files changed, 8 insertions(+)

diff --git a/drivers/media/platform/verisilicon/hantro_g1_regs.h b/drivers/media/platform/verisilicon/hantro_g1_regs.h
index c623b3b0be18..e7d4db788e57 100644
--- a/drivers/media/platform/verisilicon/hantro_g1_regs.h
+++ b/drivers/media/platform/verisilicon/hantro_g1_regs.h
@@ -232,6 +232,7 @@
#define G1_REG_DEC_CTRL7_DCT7_START_BIT(x) (((x) & 0x3f) << 0)
#define G1_REG_ADDR_STR 0x030
#define G1_REG_ADDR_DST 0x034
+#define G1_REG_ADDR_DST_CHROMA 0x038
#define G1_REG_ADDR_REF(i) (0x038 + ((i) * 0x4))
#define G1_REG_ADDR_REF_FIELD_E BIT(1)
#define G1_REG_ADDR_REF_TOPC_E BIT(0)
diff --git a/drivers/media/platform/verisilicon/hantro_g1_vp8_dec.c b/drivers/media/platform/verisilicon/hantro_g1_vp8_dec.c
index 851eb67f19f5..c6a7584b716a 100644
--- a/drivers/media/platform/verisilicon/hantro_g1_vp8_dec.c
+++ b/drivers/media/platform/verisilicon/hantro_g1_vp8_dec.c
@@ -427,6 +427,11 @@ static void cfg_buffers(struct hantro_ctx *ctx,
dst_dma = hantro_get_dec_buf_addr(ctx, &vb2_dst->vb2_buf);
vdpu_write_relaxed(vpu, dst_dma, G1_REG_ADDR_DST);
+
+ if (hdr->flags & V4L2_VP8_FRAME_FLAG_WEBP)
+ vdpu_write_relaxed(vpu, dst_dma +
+ ctx->dst_fmt.height * ctx->dst_fmt.width,

I'm not really not fan of that type of formula using padded width/height. Not
sure if its supported already, but if we have foreign buffers with a bigger
bytesperline, the IP may endup overwriting the luma. Please use the per-plane
bytesperline, we have v4l2-common to help with that when needed.
+ G1_REG_ADDR_DST_CHROMA);

OK, I'll check that.


I have a strong impression this patch is incomplete (not generic enough). The
documentation I have indicates that the resolution range for WebP can be
different for different synthesis. See swreg54 (0xd8), if bit 19 is set, then it
can support 16K x 16K resolution. There is no other way around that then
signalling explicitly at the format level that this is webp, since otherwise you
can't know from userspace and can't enumerate the different resolution. I'm
curious what is the difference at bitstream level, would be nice to clarify too.

See below WebP image details.


I've also found that when the PP is used, you need to fill some extended
dimension (SWREG92) with the missing bit of the width/height, as the dimension
don't fit the usual register.


Yes there are additional registers to set in postproc for large image >
3472x4672 and image input bitstream larger than 16777215 bytes.
I have not tested such large images for now.
Additionally I don't have postproc support on STM32MP25.
Anyway I can guard for those limits in code...

More notes, I noticed that WebP supports having a second frame for the alpha,
similar to WebM Alpha, for that we expect 2 requests, so no issue on this front.
WebP Loss-less is a completely different codec, and should have its own format.

I think overall, from my read of the spec, that its normal VP8, but the
resolution will exceed the normal one. We also can't always enable WebP, since
it will break references.

Nicolas


As far as I have understood & tested, WebP is just an encapsulation of
VP8 video chunk:
* Webp image RIFF header
*
* 52 49 46 46 f6 00 00 00 57 45 42 50 56 50 38 20 RIFF....WEBPVP8
* ea 00 00 00 90 09 00 9d 01 2a 30 00 30 00 3e 35 .........*0.0.>5
* | \______/ \______/
* | | \__VP8 startcode
* | \__VP8 frame_tag
* |
* \__End of WebP RIFF header: 20 bytes, then VP8 chunk

At least for lossy WebP.

There are two others WebP formats which are loss-less WebP and animated
WebP but untested on my side, I don't even know if those formats are
supported by the hardware IP.


On GStreamer side, the formats are entirely seperate, image/webp vs video/x-vp8
are the mime types. Seems a lot safe to keep these two as seperate formats. They
can certainly share the same stateless frame structure, with the additional flag
imho.

Nicolas

Really very few changes needed on VP8 codebase to support WebP. On my
opinion it doesn't need a fork of codec for that, hence just the minor
addition of "WebP" signaling on uAPI see GStreamer limited changes in
VP8 codebase to support WebP:
https://gitlab.freedesktop.org/gstreamer/gstreamer/-/commit/138ecfac54ce85b273a26ff6f0fefe3998f8d436?merge_request_iid=7505

If it was identical, we'd need no flag. The requirement to use the flag is not
discoverable. What I'm guessing is that anything above 1080p needs the flag. But
then if you enable that flag, you loose the ability to use references, so that
would equally break normal VP8. It seems like a VP8 decoder is compatible with
WebP, but a WebP decoder is not compatible with VP8.

I cannot accept what you believe is a simple solution since its not discover-
able by userspace. The Hantro VP8 decoder driver is not the only VP8 driver, so
the GStreamer implementation would break randomly on other SoC.

My recommendation is to introduce V4L2_PIX_FMT_WEBP_FRAME, and make it so that
format reused 100% of the VP8_FRAME format (very little work, no flag needed
since the format holds that). This way, drivers can be very explicit through
their ENUM_FORMAT implementation, and can also expose different resolution
ranges properly.

Nicolas

OK, I'll wait for your comments on GStreamer side to propose a new kernel patchset based on this new format.


p.s. you should draft the required synthesis check and postproc code, I can test
it for you.


Thanks for that ;)



}
int hantro_g1_vp8_dec_run(struct hantro_ctx *ctx)
@@ -471,6 +476,8 @@ int hantro_g1_vp8_dec_run(struct hantro_ctx *ctx)
reg |= G1_REG_DEC_CTRL0_SKIP_MODE;
if (hdr->lf.level == 0)
reg |= G1_REG_DEC_CTRL0_FILTERING_DIS;
+ if (hdr->flags & V4L2_VP8_FRAME_FLAG_WEBP)
+ reg |= G1_REG_DEC_CTRL0_WEBP_E;
vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL0);
/* Frame dimensions */



BR,
Hugues.

BR,
Hugues.