Re: [PATCH v4 1/2] media: uapi: Add H264 low-level decoder API compound controls.

From: Tomasz Figa
Date: Fri Feb 22 2019 - 02:52:16 EST


Hi Maxime,

On Wed, Feb 20, 2019 at 11:17 PM Maxime Ripard
<maxime.ripard@xxxxxxxxxxx> wrote:
>
> From: Pawel Osciak <posciak@xxxxxxxxxxxx>
>
> Stateless video codecs will require both the H264 metadata and slices in
> order to be able to decode frames.
>
> This introduces the definitions for a new pixel format for H264 slices that
> have been parsed, as well as the structures used to pass the metadata from
> the userspace to the kernel.
>
> Co-Developped-by: Maxime Ripard <maxime.ripard@xxxxxxxxxxx>
> Signed-off-by: Pawel Osciak <posciak@xxxxxxxxxxxx>
> Signed-off-by: Guenter Roeck <groeck@xxxxxxxxxxxx>
> Signed-off-by: Maxime Ripard <maxime.ripard@xxxxxxxxxxx>

Thanks for the patch. Some comments inline.

[snip]
> +``V4L2_CID_MPEG_VIDEO_H264_SLICE_PARAMS (struct)``
> + Specifies the slice parameters (as extracted from the bitstream)
> + for the associated H264 slice data. This includes the necessary
> + parameters for configuring a stateless hardware decoding pipeline
> + for H264. The bitstream parameters are defined according to
> + :ref:`h264`. Unless there's a specific comment, refer to the
> + specification for the documentation of these fields, section 7.4.3
> + "Slice Header Semantics".

Note that this is expected to be an array, with entries for all the
slices included in the bitstream buffer.

> +
> + .. note::
> +
> + This compound control is not yet part of the public kernel API and
> + it is expected to change.
> +
> +.. c:type:: v4l2_ctrl_h264_slice_param
> +
> +.. cssclass:: longtable
> +
> +.. flat-table:: struct v4l2_ctrl_h264_slice_param
> + :header-rows: 0
> + :stub-columns: 0
> + :widths: 1 1 2
> +
> + * - __u32
> + - ``size``
> + -
> + * - __u32
> + - ``header_bit_size``
> + -
> + * - __u16
> + - ``first_mb_in_slice``
> + -
> + * - __u8
> + - ``slice_type``
> + -
> + * - __u8
> + - ``pic_parameter_set_id``
> + -
> + * - __u8
> + - ``colour_plane_id``
> + -
> + * - __u8
> + - ``redundant_pic_cnt``
> + -
> + * - __u16
> + - ``frame_num``
> + -
> + * - __u16
> + - ``idr_pic_id``
> + -
> + * - __u16
> + - ``pic_order_cnt_lsb``
> + -
> + * - __s32
> + - ``delta_pic_order_cnt_bottom``
> + -
> + * - __s32
> + - ``delta_pic_order_cnt0``
> + -
> + * - __s32
> + - ``delta_pic_order_cnt1``
> + -
> + * - struct :c:type:`v4l2_h264_pred_weight_table`
> + - ``pred_weight_table``
> + -
> + * - __u32
> + - ``dec_ref_pic_marking_bit_size``
> + -
> + * - __u32
> + - ``pic_order_cnt_bit_size``
> + -
> + * - __u8
> + - ``cabac_init_idc``
> + -
> + * - __s8
> + - ``slice_qp_delta``
> + -
> + * - __s8
> + - ``slice_qs_delta``
> + -
> + * - __u8
> + - ``disable_deblocking_filter_idc``
> + -
> + * - __s8
> + - ``slice_alpha_c0_offset_div2``
> + -
> + * - __s8
> + - ``slice_beta_offset_div2``
> + -
> + * - __u8
> + - ``num_ref_idx_l0_active_minus1``
> + -
> + * - __u8
> + - ``num_ref_idx_l1_active_minus1``
> + -
> + * - __u32
> + - ``slice_group_change_cycle``
> + -
> + * - __u8
> + - ``ref_pic_list0[32]``
> + -
> + * - __u8
> + - ``ref_pic_list1[32]``
> + -

Should we explicitly document that these are the lists after applying
the per-slice modifications, as opposed to the original order from
v4l2_ctrl_h264_decode_param?

[snip]
> + * .. _V4L2-PIX-FMT-H264-SLICE:
> +
> + - ``V4L2_PIX_FMT_H264_SLICE``
> + - 'S264'
> + - H264 parsed slice data, as extracted from the H264 bitstream.
> + This format is adapted for stateless video decoders that
> + implement an H264 pipeline (using the :ref:`codec` and
> + :ref:`media-request-api`). Metadata associated with the frame
> + to decode are required to be passed through the
> + ``V4L2_CID_MPEG_VIDEO_H264_SPS``,
> + ``V4L2_CID_MPEG_VIDEO_H264_PPS``,
> + ``V4L2_CID_MPEG_VIDEO_H264_SLICE_PARAMS`` and
> + ``V4L2_CID_MPEG_VIDEO_H264_DECODE_PARAMS`` controls and
> + scaling matrices can optionally be specified through the
> + ``V4L2_CID_MPEG_VIDEO_H264_SCALING_MATRIX`` control. See the
> + :ref:`associated Codec Control IDs <v4l2-mpeg-h264>`.
> + Exactly one output and one capture buffer must be provided for
> + use with this pixel format. The output buffer must contain the
> + appropriate number of macroblocks to decode a full
> + corresponding frame to the matching capture buffer.

What does it mean that a control can be optionally specified? A
control always has a value, so how do we decide that it was specified
or not? Should we have another control (or flag) that selects whether
to use the control? How is it better than just having the control
initialized with the default scaling matrix and always using it?

[snip]

> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> index 9a920f071ff9..6443ae53597f 100644
> --- a/include/uapi/linux/videodev2.h
> +++ b/include/uapi/linux/videodev2.h
> @@ -653,6 +653,7 @@ struct v4l2_pix_format {
> #define V4L2_PIX_FMT_H264 v4l2_fourcc('H', '2', '6', '4') /* H264 with start codes */
> #define V4L2_PIX_FMT_H264_NO_SC v4l2_fourcc('A', 'V', 'C', '1') /* H264 without start codes */
> #define V4L2_PIX_FMT_H264_MVC v4l2_fourcc('M', '2', '6', '4') /* H264 MVC */
> +#define V4L2_PIX_FMT_H264_SLICE v4l2_fourcc('S', '2', '6', '4') /* H264 parsed slices */

Are we okay with adding here already, without going through staging first?

Best regards,
Tomasz