Re: [RFC v2 6/8] media: uapi: Remove bit_size field from v4l2_ctrl_hevc_slice_params

From: John Cox
Date: Tue Feb 15 2022 - 11:31:55 EST


On Tue, 15 Feb 2022 17:11:12 +0100, you wrote:

>Dne torek, 15. februar 2022 ob 17:00:33 CET je John Cox napisal(a):
>> On Tue, 15 Feb 2022 10:28:55 -0500, you wrote:
>>
>> >Le mardi 15 février 2022 à 14:50 +0000, John Cox a écrit :
>> >> On Tue, 15 Feb 2022 15:35:12 +0100, you wrote:
>> >>
>> >> >
>> >> > Le 15/02/2022 à 15:17, John Cox a écrit :
>> >> > > Hi
>> >> > >
>> >> > > > The bit size of the slice could be deduced from the buffer payload
>> >> > > > so remove bit_size field to avoid duplicated the information.
>> >> > > I think this is a bad idea. In the future we are (I hope) going to
>want
>> >> > > to have an array (variable) of slice headers all referring to the
>same
>> >> > > bit buffer. When we do that we will need this field.
>> >> >
>> >> > I wonder if that could be considering like another decode mode and so
>> >> > use an other control ?
>> >>
>> >> I, personally, would be in favour of making the slice header control a
>> >> variable array just as it is. If userland can't cope with multiple
>> >> entries then just send them one at a time and the code looks exactly
>> >> like it does at the moment and if the driver can't then set max array
>> >> entries to 1.
>> >>
>> >> Having implemented this in rpi port of ffmpeg and the RPi V4L2 driver I
>> >> can say with experience that the code and effort overhead is very low.
>> >>
>> >> Either way having a multiple slice header control in the UAPI is
>> >> important for efficiency.
>> >
>> >Just to clarify the idea, we would have a single slice controls, always
>dynamic:
>> >
>> >1. For sliced based decoder
>> >
>> >The dynamic array slice control is implemented by the driver and its size
>must
>> >be 1.
>>
>> Yes
>>
>> >2. For frame based decoder that don't care for slices
>> >
>> >The dynamic array slice controls is not implement. Userland detects that at
>> >runtime, similar to the VP9 compressed headers.
>>
>> If the driver parses all the slice header then that seems plausible
>>
>> >3. For frame based decoders that needs slices (or driver that supports
>offset
>> >and can gain performance with such mode)
>> >
>> >The dynamic array slice controls is implemented, and should contain all the
>> >slices found in the OUTPUT buffer.
>> >
>> >So the reason for this bit_size (not sure why its bits though, perhaps
>someone
>> >can educate me ?)
>>
>> RPi doesn't need bits and would be happy with bytes however
>> slice_segment data isn't byte aligned at the end so its possible that
>> there might be decoders out there that want an accurate length for that.
>
>There are two fields, please don't mix them up:
>
>__u32 bit_size;
>__u32 data_bit_offset; (changed to data_byte_offset in this series)
>
>data_bit_offset/data_byte_offset is useful, while bit_size is IMO not. If you
>have multiple slices in array, you only need to know start of the slice data
>and that offset is always offset from start of the buffer (absolute, it's not
>relative to previous slice data).

No... or at least I think not. RPi needs the start and end of the
slice_segment_data elements of each slice. If slices are arranged in the
buffer with slice_segment_headers attached then I don't see how I get to
know that. Also if the OUTPUT buffer is just a bit of bitstream, which
might well be very convienient for some userspace, then it is legitimate
to have SEIs between slice headers so you can't even guarantee that your
coded slice segments are contiguous.

Regards

JC

>Best regards,
>Jernej
>
>>
>> > Would be to let the driver offset inside the the single
>> >OUTPUT/bitstream buffer in case this is not automatically found by the
>driver
>> >(or that no start-code is needed). Is that last bit correct ? If so, should
>we
>> >change it to an offset rather then a size ? Shall we allow using offesets
>inside
>> >larger buffer (e.g. to avoid some memory copies) for the Sliced Base cases ?
>>
>> I use (in the current structure) data_bit_offset to find the start of
>> each slice's slice_segment_data within the OUTPUT buffer and bit_size to
>> find the end. RPi doesn't / can't parse the slice_header and so wants
>> all of that. Decoders that do parse the header might plausably want
>> header offsets too and it would facilitate zero copy of the bit buffer.
>>
>>
>> >> Regards
>> >>
>> >> John Cox
>> >>
>> >> > > > Signed-off-by: Benjamin Gaignard <benjamin.gaignard@xxxxxxxxxxxxx>
>> >> > > > ---
>> >> > > > .../userspace-api/media/v4l/ext-ctrls-codec.rst | 3 ---
>> >> > > > drivers/staging/media/sunxi/cedrus/cedrus_h265.c | 11 +++
>+-------
>> >> > > > include/uapi/linux/v4l2-controls.h | 3 +--
>> >> > > > 3 files changed, 5 insertions(+), 12 deletions(-)
>> >> > > >
>> >> > > > diff --git a/Documentation/userspace-api/media/v4l/ext-ctrls-
>codec.rst b/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
>> >> > > > index 3296ac3b9fca..c3ae97657fa7 100644
>> >> > > > --- a/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
>> >> > > > +++ b/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
>> >> > > > @@ -2965,9 +2965,6 @@ enum v4l2_mpeg_video_hevc_size_of_length_field
>-
>> >> > > > :stub-columns: 0
>> >> > > > :widths: 1 1 2
>> >> > > >
>> >> > > > - * - __u32
>> >> > > > - - ``bit_size``
>> >> > > > - - Size (in bits) of the current slice data.
>> >> > > > * - __u32
>> >> > > > - ``data_bit_offset``
>> >> > > > - Offset (in bits) to the video data in the current slice
>data.
>> >> > > > diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c b/
>drivers/staging/media/sunxi/cedrus/cedrus_h265.c
>> >> > > > index 8ab2d9c6f048..db8c7475eeb8 100644
>> >> > > > --- a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
>> >> > > > +++ b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
>> >> > > > @@ -312,8 +312,8 @@ static void cedrus_h265_setup(struct cedrus_ctx
>*ctx,
>> >> > > > const struct v4l2_hevc_pred_weight_table *pred_weight_table;
>> >> > > > unsigned int width_in_ctb_luma, ctb_size_luma;
>> >> > > > unsigned int log2_max_luma_coding_block_size;
>> >> > > > + size_t slice_bytes;
>> >> > > > dma_addr_t src_buf_addr;
>> >> > > > - dma_addr_t src_buf_end_addr;
>> >> > > > u32 chroma_log2_weight_denom;
>> >> > > > u32 output_pic_list_index;
>> >> > > > u32 pic_order_cnt[2];
>> >> > > > @@ -370,8 +370,8 @@ static void cedrus_h265_setup(struct cedrus_ctx
>*ctx,
>> >> > > >
>> >> > > > cedrus_write(dev, VE_DEC_H265_BITS_OFFSET, 0);
>> >> > > >
>> >> > > > - reg = slice_params->bit_size;
>> >> > > > - cedrus_write(dev, VE_DEC_H265_BITS_LEN, reg);
>> >> > > > + slice_bytes = vb2_get_plane_payload(&run->src->vb2_buf, 0);
>> >> > > > + cedrus_write(dev, VE_DEC_H265_BITS_LEN, slice_bytes);
>> >> > > I think one of these must be wrong. bit_size is in bits,
>> >> > > vb2_get_plane_payload is in bytes?
>> >> >
>> >> > You are right it should be vb2_get_plane_payload() * 8 to get the size
>in bits.
>> >> >
>> >> > I will change that in v3.
>> >> >
>> >> > >
>> >> > > Regards
>> >> > >
>> >> > > John Cox
>> >> > >
>> >> > > > /* Source beginning and end addresses. */
>> >> > > >
>> >> > > > @@ -384,10 +384,7 @@ static void cedrus_h265_setup(struct
>cedrus_ctx *ctx,
>> >> > > >
>> >> > > > cedrus_write(dev, VE_DEC_H265_BITS_ADDR, reg);
>> >> > > >
>> >> > > > - src_buf_end_addr = src_buf_addr +
>> >> > > > - DIV_ROUND_UP(slice_params->bit_size,
>8);
>> >> > > > -
>> >> > > > - reg = VE_DEC_H265_BITS_END_ADDR_BASE(src_buf_end_addr);
>> >> > > > + reg = VE_DEC_H265_BITS_END_ADDR_BASE(src_buf_addr + slice_bytes);
>> >> > > > cedrus_write(dev, VE_DEC_H265_BITS_END_ADDR, reg);
>> >> > > >
>> >> > > > /* Coding tree block address */
>> >> > > > diff --git a/include/uapi/linux/v4l2-controls.h b/include/uapi/
>linux/v4l2-controls.h
>> >> > > > index b1a3dc05f02f..27f5d272dc43 100644
>> >> > > > --- a/include/uapi/linux/v4l2-controls.h
>> >> > > > +++ b/include/uapi/linux/v4l2-controls.h
>> >> > > > @@ -2457,7 +2457,6 @@ struct v4l2_hevc_pred_weight_table {
>> >> > > > #define V4L2_HEVC_SLICE_PARAMS_FLAG_DEPENDENT_SLICE_SEGMENT
>(1ULL << 9)
>> >> > > >
>> >> > > > struct v4l2_ctrl_hevc_slice_params {
>> >> > > > - __u32 bit_size;
>> >> > > > __u32 data_bit_offset;
>> >> > > >
>> >> > > > /* ISO/IEC 23008-2, ITU-T Rec. H.265: NAL unit header */
>> >> > > > @@ -2484,7 +2483,7 @@ struct v4l2_ctrl_hevc_slice_params {
>> >> > > > /* ISO/IEC 23008-2, ITU-T Rec. H.265: Picture timing SEI message
>*/
>> >> > > > __u8 pic_struct;
>> >> > > >
>> >> > > > - __u8 reserved;
>> >> > > > + __u8 reserved[5];
>> >> > > >
>> >> > > > /* ISO/IEC 23008-2, ITU-T Rec. H.265: General slice segment
>header */
>> >> > > > __u32 slice_segment_addr;
>>
>