Re: [linux-sunxi] [PATCH v2 1/2] media: v4l: Add definitions for the HEVC slice format and controls
From: Ayaka
Date: Thu Jan 24 2019 - 07:23:39 EST
Sent from my iPad
> On Jan 24, 2019, at 6:27 PM, Paul Kocialkowski <paul.kocialkowski@xxxxxxxxxxx> wrote:
>
> Hi,
>
>> On Thu, 2019-01-10 at 21:32 +0800, ayaka wrote:
>> I forgot an important thing: for the rkvdec and the rk hevc decoder, it
>> would request the cabac table, scaling list, picture parameter set and
>> reference picture to be stored in one or several DMA buffers. I am not
>> talking about the parsed data, the decoder requests the raw data.
>>
>> For the pps and rps, it is possible to reuse the slice header: just let
>> the decoder know the offset into the bitstream buffer. I would suggest
>> adding three properties (together with the sps) for them. But I think we
>> need a method to mark an OUTPUT side buffer for that aux data.
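
To illustrate the offset idea above (this is only a sketch, the control
name and fields are placeholders and not a concrete proposal), such a
control could simply carry byte offsets into the OUTPUT buffer:

struct v4l2_ctrl_hevc_bitstream_offsets {
	/* byte offset of the raw SPS inside the OUTPUT buffer */
	__u32	sps_offset;
	/* byte offset of the raw PPS */
	__u32	pps_offset;
	/* byte offset of the raw slice header, which already carries the RPS */
	__u32	slice_header_offset;
};
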
>
> I'm quite confused about the hardware implementation then. From what
> you're saying, it seems that it takes the raw bitstream elements rather
> than parsed elements. Is it really a stateless implementation?
>
> The stateless implementation was designed with the idea that only the
> raw slice data should be passed in bitstream form to the decoder. For
> H.264, it seems that some decoders also need the slice header in raw
> bitstream form (because they take the full slice NAL unit), see the
> discussions in this thread:
> media: docs-rst: Document m2m stateless video decoder interface
Stateless just means it won't track the previous result, but I don't think you can define what data the hardware would need. Even if you just build a DPB for the decoder, it is still stateless; parsing more or less data from the bitstream doesn't stop a decoder from being a stateless decoder.
>
> Can you detail exactly what the rockchip decoder absolutely needs in
> raw bitstream format?
>
> Cheers,
>
> Paul
>
>>> On 1/8/19 6:00 PM, Ayaka wrote:
>>> Sent from my iPad
>>>
>>>> On Jan 8, 2019, at 4:38 PM, Paul Kocialkowski <paul.kocialkowski@xxxxxxxxxxx> wrote:
>>>>
>>>> Hi,
>>>>
>>>>> On Tue, 2019-01-08 at 09:16 +0800, Ayaka wrote:
>>>>>
>>>>> Sent from my iPad
>>>>>
>>>>>> On Jan 7, 2019, at 5:57 PM, Paul Kocialkowski <paul.kocialkowski@xxxxxxxxxxx> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>>>> On Mon, 2019-01-07 at 11:49 +0800, Randy Li wrote:
>>>>>>>> On 12/12/18 8:51 PM, Paul Kocialkowski wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> On Wed, 2018-12-05 at 21:59 +0100, Jernej Škrabec wrote:
>>>>>>>>
>>>>>>>>>> +
>>>>>>>>>> +#define V4L2_HEVC_DPB_ENTRY_RPS_ST_CURR_BEFORE 0x01
>>>>>>>>>> +#define V4L2_HEVC_DPB_ENTRY_RPS_ST_CURR_AFTER 0x02
>>>>>>>>>> +#define V4L2_HEVC_DPB_ENTRY_RPS_LT_CURR 0x03
>>>>>>>>>> +
>>>>>>>>>> +#define V4L2_HEVC_DPB_ENTRIES_NUM_MAX 16
>>>>>>>>>> +
>>>>>>>>>> +struct v4l2_hevc_dpb_entry {
>>>>>>>>>> + __u32 buffer_tag;
>>>>>>>>>> + __u8 rps;
>>>>>>>>>> + __u8 field_pic;
>>>>>>>>>> + __u16 pic_order_cnt[2];
>>>>>>>>>> +};
>>>>>>> Please add a property for the reference index; if that rps field is not
>>>>>>> used for this, some devices would request it (not the Rockchip one). And
>>>>>>> Rockchip's VDPU1 and VDPU2 for AVC would request a similar property.
>>>>>> What exactly is that reference index? Is it a bitstream element or
>>>>>> something deduced from the bitstream?
>>>>>>
>>>>> The picture order count (POC) for HEVC and frame_num in AVC. I think
>>>>> it is the number used in list0 (P and B slices) and list1 (B slices).
>>>> The picture order count is already the last field of the DPB entry
>>>> structure. There is one for each field picture.
>>> As we are not sure whether there can be a field-coded slice or CTU, I would hold off on this part and everything else related to fields.
>>>>>>> Add another buffer_tag referring to the memory holding the motion vectors
>>>>>>> of each frame. Or, a better method is to add a meta data entry to each
>>>>>>> picture buffer, since the picture output is just the same as the original;
>>>>>>> the display won't care whether the motion vectors are written at the
>>>>>>> bottom of the picture or somewhere else.
>>>>>> The motion vectors are passed as part of the raw bitstream data, in the
>>>>>> slices. Is there a case where the motion vectors are coded differently?
>>>>> No, it is an additional cache for the decoder; even FFmpeg keeps such
>>>>> data, so I think Allwinner must output it somewhere as well.
>>>> Ah yes I see what you mean! This is handled internally by our driver
>>>> and not exposed to userspace. I don't think it would be a good idea to
>>>> expose this cache or request that userspace allocates it like a video
>>>> buffer.
>>>>
>>> No, usually the driver should allocate it, as user space has no idea of the size each device needs.
>>> But for advanced users, an application could fix a broken picture with proper data or analyze object motion from it.
>>> So I would suggest attaching this information to the picture buffer as meta data.
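
As an illustration only (the structure and field names below are made up,
and the driver would still do the allocation itself), such meta data could
be as small as an offset and a size telling where the decoder wrote its
motion vector cache for that picture:

struct v4l2_hevc_mv_meta {
	/* where the MV cache was written, relative to the start of the picture buffer */
	__u32	offset;
	/* size of the MV cache in bytes */
	__u32	size;
};
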
>>>>>>>>>> +
>>>>>>>>>> +struct v4l2_hevc_pred_weight_table {
>>>>>>>>>> + __u8 luma_log2_weight_denom;
>>>>>>>>>> + __s8 delta_chroma_log2_weight_denom;
>>>>>>>>>> +
>>>>>>>>>> + __s8 delta_luma_weight_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>>>>> + __s8 luma_offset_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>>>>> + __s8 delta_chroma_weight_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
>>>>>>>>>> + __s8 chroma_offset_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
>>>>>>>>>> +
>>>>>>>>>> + __s8 delta_luma_weight_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>>>>> + __s8 luma_offset_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>>>>> + __s8 delta_chroma_weight_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
>>>>>>>>>> + __s8 chroma_offset_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
>>>>>>>>>> +};
>>>>>>>>>> +
>>>>>>> I think those properties are not necessary for the Rockchip device, and
>>>>>>> they may not work for the others.
>>>>>> Yes, it's possible that some of the elements are not necessary for some
>>>>>> decoders. What we want is to cover all the elements that might be
>>>>>> required for a decoder.
>>>>> I wonder whether Allwinner needs that; those sao flags are usually
>>>>> ignored in decoder designs. But more is better than less, since it is
>>>>> hard to extend a v4l2 structure in the future; maybe a new HEVC profile
>>>>> would bring a new property, and it is still too early for HEVC.
>>>> Yes this is used by our decoder. The idea is to have all the basic
>>>> bitstream elements in the structures (even if some decoders don't use
>>>> them all) and add others for extension as separate controls later.
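
For what it's worth, a later extension could then look roughly like the
following, as its own compound control rather than a change to the
existing structure (the name and field below are invented, purely for
illustration):

struct v4l2_ctrl_hevc_slice_params_ext {
	/* hypothetical future bitstream element */
	__u8	some_new_flag;
	/* keep the uAPI structure 4-byte aligned */
	__u8	reserved[3];
};
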
>>>>
>>>>>>>>>> +struct v4l2_ctrl_hevc_slice_params {
>>>>>>>>>> + __u32 bit_size;
>>>>>>>>>> + __u32 data_bit_offset;
>>>>>>>>>> +
>>>>>>>>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: NAL unit header */
>>>>>>>>>> + __u8 nal_unit_type;
>>>>>>>>>> + __u8 nuh_temporal_id_plus1;
>>>>>>>>>> +
>>>>>>>>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: General slice segment header */
>>>>>>>>>> + __u8 slice_type;
>>>>>>>>>> + __u8 colour_plane_id;
>>>>>>> ----------------------------------------------------------------------------
>>>>>>>>>> + __u16 slice_pic_order_cnt;
>>>>>>>>>> + __u8 slice_sao_luma_flag;
>>>>>>>>>> + __u8 slice_sao_chroma_flag;
>>>>>>>>>> + __u8 slice_temporal_mvp_enabled_flag;
>>>>>>>>>> + __u8 num_ref_idx_l0_active_minus1;
>>>>>>>>>> + __u8 num_ref_idx_l1_active_minus1;
>>>>>>> Rockchip's decoder doesn't use this part.
>>>>>>>>>> + __u8 mvd_l1_zero_flag;
>>>>>>>>>> + __u8 cabac_init_flag;
>>>>>>>>>> + __u8 collocated_from_l0_flag;
>>>>>>>>>> + __u8 collocated_ref_idx;
>>>>>>>>>> + __u8 five_minus_max_num_merge_cand;
>>>>>>>>>> + __u8 use_integer_mv_flag;
>>>>>>>>>> + __s8 slice_qp_delta;
>>>>>>>>>> + __s8 slice_cb_qp_offset;
>>>>>>>>>> + __s8 slice_cr_qp_offset;
>>>>>>>>>> + __s8 slice_act_y_qp_offset;
>>>>>>>>>> + __s8 slice_act_cb_qp_offset;
>>>>>>>>>> + __s8 slice_act_cr_qp_offset;
>>>>>>>>>> + __u8 slice_deblocking_filter_disabled_flag;
>>>>>>>>>> + __s8 slice_beta_offset_div2;
>>>>>>>>>> + __s8 slice_tc_offset_div2;
>>>>>>>>>> + __u8 slice_loop_filter_across_slices_enabled_flag;
>>>>>>>>>> +
>>>>>>>>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: Picture timing SEI message */
>>>>>>>>>> + __u8 pic_struct;
>>>>>>> I think the decoder doesn't care about this; it is used for display.
>>>>>> The purpose of this field is to indicate whether the current picture is
>>>>>> a progressive frame or an interlaced field picture, which is useful for
>>>>>> decoding.
>>>>>>
>>>>>> At least our decoder has a register field to indicate frame/top
>>>>>> field/bottom field, so we certainly need to keep the info around.
>>>>>> Looking at the spec and the ffmpeg implementation, it looks like this
>>>>>> flag of the bitstream is the usual way to report field coding.
>>>>> It depends on whether the decoder cares about the scan type or more. I
>>>>> would rather prefer general_interlaced_source_flag for just the scan
>>>>> type; it would be better than reading yet another SEI.
>>>> Well we still need a way to indicate if the current data is top or
>>>> bottom field for interlaced. I don't think that knowing that the whole
>>>> video is interlaced would be precise enough.
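
To make that concrete, this is roughly the mapping a driver would do with
the field (the register names below are invented; the pic_struct values
0/1/2 for frame/top field/bottom field are the ones from the picture
timing SEI):

switch (slice_params->pic_struct) {
case 0:	/* progressive frame */
	reg |= VE_FIELD_NONE;
	break;
case 1:	/* top field */
	reg |= VE_FIELD_TOP;
	break;
case 2:	/* bottom field */
	reg |= VE_FIELD_BOTTOM;
	break;
}
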
>>>>
>>>> Cheers,
>>>>
>>>> Paul
>>>>
>>>>>>>>>> +
>>>>>>>>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: General slice segment header */
>>>>>>>>>> + struct v4l2_hevc_dpb_entry dpb[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>>>>> + __u8 num_active_dpb_entries;
>>>>>>>>>> + __u8 ref_idx_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>>>>> + __u8 ref_idx_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>>>>> +
>>>>>>>>>> + __u8 num_rps_poc_st_curr_before;
>>>>>>>>>> + __u8 num_rps_poc_st_curr_after;
>>>>>>>>>> + __u8 num_rps_poc_lt_curr;
>>>>>>>>>> +
>>>>>>>>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: Weighted prediction parameter */
>>>>>>>>>> + struct v4l2_hevc_pred_weight_table pred_weight_table;
>>>>>>>>>> +};
>>>>>>>>>> +
>>>>>>>>>> #endif
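
For completeness, this is roughly how userspace would hand the structure
above to the driver through the request API (a sketch assuming the
V4L2_CID_MPEG_VIDEO_HEVC_SLICE_PARAMS name from this series, a request_fd
obtained with MEDIA_IOC_REQUEST_ALLOC, <linux/videodev2.h> and
<sys/ioctl.h>; error handling left out):

struct v4l2_ctrl_hevc_slice_params slice_params = {0};
struct v4l2_ext_control ctrl = {0};
struct v4l2_ext_controls ctrls = {0};

/* ... fill slice_params from the parsed slice header ... */

ctrl.id = V4L2_CID_MPEG_VIDEO_HEVC_SLICE_PARAMS;
ctrl.size = sizeof(slice_params);
ctrl.ptr = &slice_params;

ctrls.which = V4L2_CTRL_WHICH_REQUEST_VAL;
ctrls.request_fd = request_fd;
ctrls.count = 1;
ctrls.controls = &ctrl;

ioctl(video_fd, VIDIOC_S_EXT_CTRLS, &ctrls);
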
>>>>>> --
>>>>>> Paul Kocialkowski, Bootlin (formerly Free Electrons)
>>>>>> Embedded Linux and kernel engineering
>>>>>> https://bootlin.com
>>>>>>
>>>> --
>>>> Paul Kocialkowski, Bootlin (formerly Free Electrons)
>>>> Embedded Linux and kernel engineering
>>>> https://bootlin.com
>>>>
> --
> Paul Kocialkowski, Bootlin
> Embedded Linux and kernel engineering
> https://bootlin.com
>