Re: [RFC PATCH v6 09/11] media: uapi: Add audio rate controls support
From: Shengjiu Wang
Date: Wed Oct 18 2023 - 09:51:08 EST
On Wed, Oct 18, 2023 at 9:09 PM Hans Verkuil <hverkuil@xxxxxxxxx> wrote:
>
> On 18/10/2023 14:52, Shengjiu Wang wrote:
> > On Wed, Oct 18, 2023 at 3:58 PM Hans Verkuil <hverkuil@xxxxxxxxx> wrote:
> >>
> >> On 18/10/2023 09:40, Shengjiu Wang wrote:
> >>> On Wed, Oct 18, 2023 at 3:31 PM Hans Verkuil <hverkuil@xxxxxxxxx> wrote:
> >>>>
> >>>> On 18/10/2023 09:23, Shengjiu Wang wrote:
> >>>>> On Wed, Oct 18, 2023 at 10:27 AM Shengjiu Wang <shengjiu.wang@xxxxxxxxx> wrote:
> >>>>>>
> >>>>>> On Tue, Oct 17, 2023 at 9:37 PM Hans Verkuil <hverkuil@xxxxxxxxx> wrote:
> >>>>>>>
> >>>>>>> On 17/10/2023 15:11, Shengjiu Wang wrote:
> >>>>>>>> On Mon, Oct 16, 2023 at 9:16 PM Hans Verkuil <hverkuil@xxxxxxxxx> wrote:
> >>>>>>>>>
> >>>>>>>>> Hi Shengjiu,
> >>>>>>>>>
> >>>>>>>>> On 13/10/2023 10:31, Shengjiu Wang wrote:
> >>>>>>>>>> Fixed point controls are used by the user to configure
> >>>>>>>>>> the audio sample rate to driver.
> >>>>>>>>>>
> >>>>>>>>>> Add V4L2_CID_ASRC_SOURCE_RATE and V4L2_CID_ASRC_DEST_RATE
> >>>>>>>>>> new IDs for ASRC rate control.
> >>>>>>>>>>
> >>>>>>>>>> Signed-off-by: Shengjiu Wang <shengjiu.wang@xxxxxxx>
> >>>>>>>>>> ---
> >>>>>>>>>> .../userspace-api/media/v4l/common.rst | 1 +
> >>>>>>>>>> .../media/v4l/ext-ctrls-fixed-point.rst | 36 +++++++++++++++++++
> >>>>>>>>>> .../media/v4l/vidioc-g-ext-ctrls.rst | 4 +++
> >>>>>>>>>> .../media/v4l/vidioc-queryctrl.rst | 7 ++++
> >>>>>>>>>> .../media/videodev2.h.rst.exceptions | 1 +
> >>>>>>>>>> drivers/media/v4l2-core/v4l2-ctrls-core.c | 5 +++
> >>>>>>>>>> drivers/media/v4l2-core/v4l2-ctrls-defs.c | 4 +++
> >>>>>>>>>> include/media/v4l2-ctrls.h | 2 ++
> >>>>>>>>>> include/uapi/linux/v4l2-controls.h | 13 +++++++
> >>>>>>>>>> include/uapi/linux/videodev2.h | 3 ++
> >>>>>>>>>> 10 files changed, 76 insertions(+)
> >>>>>>>>>> create mode 100644 Documentation/userspace-api/media/v4l/ext-ctrls-fixed-point.rst
> >>>>>>>>>>
> >>>>>>>>>> diff --git a/Documentation/userspace-api/media/v4l/common.rst b/Documentation/userspace-api/media/v4l/common.rst
> >>>>>>>>>> index ea0435182e44..35707edffb13 100644
> >>>>>>>>>> --- a/Documentation/userspace-api/media/v4l/common.rst
> >>>>>>>>>> +++ b/Documentation/userspace-api/media/v4l/common.rst
> >>>>>>>>>> @@ -52,6 +52,7 @@ applicable to all devices.
> >>>>>>>>>> ext-ctrls-fm-rx
> >>>>>>>>>> ext-ctrls-detect
> >>>>>>>>>> ext-ctrls-colorimetry
> >>>>>>>>>> + ext-ctrls-fixed-point
> >>>>>>>>>
> >>>>>>>>> Rename this to ext-ctrls-audio-m2m.
> >>>>>>>>>
> >>>>>>>>>> fourcc
> >>>>>>>>>> format
> >>>>>>>>>> planar-apis
> >>>>>>>>>> diff --git a/Documentation/userspace-api/media/v4l/ext-ctrls-fixed-point.rst b/Documentation/userspace-api/media/v4l/ext-ctrls-fixed-point.rst
> >>>>>>>>>> new file mode 100644
> >>>>>>>>>> index 000000000000..2ef6e250580c
> >>>>>>>>>> --- /dev/null
> >>>>>>>>>> +++ b/Documentation/userspace-api/media/v4l/ext-ctrls-fixed-point.rst
> >>>>>>>>>> @@ -0,0 +1,36 @@
> >>>>>>>>>> +.. SPDX-License-Identifier: GFDL-1.1-no-invariants-or-later
> >>>>>>>>>> +
> >>>>>>>>>> +.. _fixed-point-controls:
> >>>>>>>>>> +
> >>>>>>>>>> +***************************
> >>>>>>>>>> +Fixed Point Control Reference
> >>>>>>>>>
> >>>>>>>>> This is for audio controls. "Fixed Point" is just the type, and it doesn't make
> >>>>>>>>> sense to group fixed point controls. But it does make sense to group the audio
> >>>>>>>>> controls.
> >>>>>>>>>
> >>>>>>>>> V4L2 controls can be grouped into classes. Basically it is a way to put controls
> >>>>>>>>> into categories, and for each category there is also a control that gives a
> >>>>>>>>> description of the class (see 2.15.15 in
> >>>>>>>>> https://linuxtv.org/downloads/v4l-dvb-apis-new/driver-api/v4l2-controls.html#introduction)
> >>>>>>>>>
> >>>>>>>>> If you use e.g. 'v4l2-ctl -l' to list all the controls, then you will see that
> >>>>>>>>> they are grouped based on what class of control they are.
> >>>>>>>>>
> >>>>>>>>> So I think it would be a good idea to create a new control class for M2M audio controls,
> >>>>>>>>> instead of just adding them to the catch-all 'User Controls' class.
> >>>>>>>>>
> >>>>>>>>> Search e.g. for V4L2_CTRL_CLASS_COLORIMETRY and V4L2_CID_COLORIMETRY_CLASS to see how
> >>>>>>>>> it is done.
> >>>>>>>>>
> >>>>>>>>> M2M_AUDIO would probably be a good name for the class.
> >>>>>>>>>
> >>>>>>>>>> +***************************
> >>>>>>>>>> +
> >>>>>>>>>> +These controls are intended to support an asynchronous sample
> >>>>>>>>>> +rate converter.
> >>>>>>>>>
> >>>>>>>>> Add ' (ASRC).' at the end to indicate the common abbreviation for
> >>>>>>>>> that.
> >>>>>>>>>
> >>>>>>>>>> +
> >>>>>>>>>> +.. _v4l2-audio-asrc:
> >>>>>>>>>> +
> >>>>>>>>>> +``V4L2_CID_ASRC_SOURCE_RATE``
> >>>>>>>>>> + sets the resampler source rate.
> >>>>>>>>>> +
> >>>>>>>>>> +``V4L2_CID_ASRC_DEST_RATE``
> >>>>>>>>>> + sets the resampler destination rate.
> >>>>>>>>>
> >>>>>>>>> Document the unit (Hz) for these two controls.
> >>>>>>>>>
> >>>>>>>>>> +
> >>>>>>>>>> +.. c:type:: v4l2_ctrl_fixed_point
> >>>>>>>>>> +
> >>>>>>>>>> +.. cssclass:: longtable
> >>>>>>>>>> +
> >>>>>>>>>> +.. tabularcolumns:: |p{1.5cm}|p{5.8cm}|p{10.0cm}|
> >>>>>>>>>> +
> >>>>>>>>>> +.. flat-table:: struct v4l2_ctrl_fixed_point
> >>>>>>>>>> + :header-rows: 0
> >>>>>>>>>> + :stub-columns: 0
> >>>>>>>>>> + :widths: 1 1 2
> >>>>>>>>>> +
> >>>>>>>>>> + * - __u32
> >>>>>>>>>
> >>>>>>>>> Hmm, shouldn't this be __s32?
> >>>>>>>>>
> >>>>>>>>>> + - ``integer``
> >>>>>>>>>> + - integer part of fixed point value.
> >>>>>>>>>> + * - __s32
> >>>>>>>>>
> >>>>>>>>> and this __u32?
> >>>>>>>>>
> >>>>>>>>> You want to be able to use this generic type as a signed value.
> >>>>>>>>>
> >>>>>>>>>> + - ``fractional``
> >>>>>>>>>> + - fractional part of fixed point value, which is Q31.
> >>>>>>>>>> diff --git a/Documentation/userspace-api/media/v4l/vidioc-g-ext-ctrls.rst b/Documentation/userspace-api/media/v4l/vidioc-g-ext-ctrls.rst
> >>>>>>>>>> index f9f73530a6be..1811dabf5c74 100644
> >>>>>>>>>> --- a/Documentation/userspace-api/media/v4l/vidioc-g-ext-ctrls.rst
> >>>>>>>>>> +++ b/Documentation/userspace-api/media/v4l/vidioc-g-ext-ctrls.rst
> >>>>>>>>>> @@ -295,6 +295,10 @@ still cause this situation.
> >>>>>>>>>> - ``p_av1_film_grain``
> >>>>>>>>>> - A pointer to a struct :c:type:`v4l2_ctrl_av1_film_grain`. Valid if this control is
> >>>>>>>>>> of type ``V4L2_CTRL_TYPE_AV1_FILM_GRAIN``.
> >>>>>>>>>> + * - struct :c:type:`v4l2_ctrl_fixed_point` *
> >>>>>>>>>> + - ``p_fixed_point``
> >>>>>>>>>> + - A pointer to a struct :c:type:`v4l2_ctrl_fixed_point`. Valid if this control is
> >>>>>>>>>> + of type ``V4L2_CTRL_TYPE_FIXED_POINT``.
> >>>>>>>>>> * - void *
> >>>>>>>>>> - ``ptr``
> >>>>>>>>>> - A pointer to a compound type which can be an N-dimensional array
> >>>>>>>>>> diff --git a/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst b/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst
> >>>>>>>>>> index 4d38acafe8e1..9285f4f39eed 100644
> >>>>>>>>>> --- a/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst
> >>>>>>>>>> +++ b/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst
> >>>>>>>>>> @@ -549,6 +549,13 @@ See also the examples in :ref:`control`.
> >>>>>>>>>> - n/a
> >>>>>>>>>> - A struct :c:type:`v4l2_ctrl_av1_film_grain`, containing AV1 Film Grain
> >>>>>>>>>> parameters for stateless video decoders.
> >>>>>>>>>> + * - ``V4L2_CTRL_TYPE_FIXED_POINT``
> >>>>>>>>>> + - n/a
> >>>>>>>>>> + - n/a
> >>>>>>>>>> + - n/a
> >>>>>>>>>> + - A struct :c:type:`v4l2_ctrl_fixed_point`, containing parameter which has
> >>>>>>>>>> + integer part and fractional part, i.e. audio sample rate.
> >>>>>>>>>> +
> >>>>>>>>>>
> >>>>>>>>>> .. raw:: latex
> >>>>>>>>>>
> >>>>>>>>>> diff --git a/Documentation/userspace-api/media/videodev2.h.rst.exceptions b/Documentation/userspace-api/media/videodev2.h.rst.exceptions
> >>>>>>>>>> index e61152bb80d1..2faa5a2015eb 100644
> >>>>>>>>>> --- a/Documentation/userspace-api/media/videodev2.h.rst.exceptions
> >>>>>>>>>> +++ b/Documentation/userspace-api/media/videodev2.h.rst.exceptions
> >>>>>>>>>> @@ -167,6 +167,7 @@ replace symbol V4L2_CTRL_TYPE_AV1_SEQUENCE :c:type:`v4l2_ctrl_type`
> >>>>>>>>>> replace symbol V4L2_CTRL_TYPE_AV1_TILE_GROUP_ENTRY :c:type:`v4l2_ctrl_type`
> >>>>>>>>>> replace symbol V4L2_CTRL_TYPE_AV1_FRAME :c:type:`v4l2_ctrl_type`
> >>>>>>>>>> replace symbol V4L2_CTRL_TYPE_AV1_FILM_GRAIN :c:type:`v4l2_ctrl_type`
> >>>>>>>>>> +replace symbol V4L2_CTRL_TYPE_FIXED_POINT :c:type:`v4l2_ctrl_type`
> >>>>>>>>>>
> >>>>>>>>>> # V4L2 capability defines
> >>>>>>>>>> replace define V4L2_CAP_VIDEO_CAPTURE device-capabilities
> >>>>>>>>>> diff --git a/drivers/media/v4l2-core/v4l2-ctrls-core.c b/drivers/media/v4l2-core/v4l2-ctrls-core.c
> >>>>>>>>>> index a662fb60f73f..7a616ac91059 100644
> >>>>>>>>>> --- a/drivers/media/v4l2-core/v4l2-ctrls-core.c
> >>>>>>>>>> +++ b/drivers/media/v4l2-core/v4l2-ctrls-core.c
> >>>>>>>>>> @@ -1168,6 +1168,8 @@ static int std_validate_compound(const struct v4l2_ctrl *ctrl, u32 idx,
> >>>>>>>>>> if (!area->width || !area->height)
> >>>>>>>>>> return -EINVAL;
> >>>>>>>>>> break;
> >>>>>>>>>> + case V4L2_CTRL_TYPE_FIXED_POINT:
> >>>>>>>>>> + break;
> >>>>>>>>>
> >>>>>>>>> Hmm, this would need this patch 'v4l2-ctrls: add support for V4L2_CTRL_WHICH_MIN/MAX_VAL':
> >>>>>>>>>
> >>>>>>>>> https://patchwork.linuxtv.org/project/linux-media/patch/20231010022136.1504015-7-yunkec@xxxxxxxxxx/
> >>>>>>>>>
> >>>>>>>>> since min and max values are perfectly fine for a fixed point value.
> >>>>>>>>>
> >>>>>>>>> Even a step value (currently not supported in that patch) would make sense.
> >>>>>>>>>
> >>>>>>>>> But I wonder if we couldn't simplify this: instead of creating a v4l2_ctrl_fixed_point,
> >>>>>>>>> why not represent the fixed point value as a Q31.32. Then the standard
> >>>>>>>>> minimum/maximum/step values can be used, and it acts like a regular V4L2_TYPE_INTEGER64.
> >>>>>>>>>
> >>>>>>>>> Except that both userspace and drivers need to multiply it with 2^-32 to get the actual
> >>>>>>>>> value.
> >>>>>>>>>
> >>>>>>>>> So in enum v4l2_ctrl_type add:
> >>>>>>>>>
> >>>>>>>>> V4L2_CTRL_TYPE_FIXED_POINT = 10,
> >>>>>>>>>
> >>>>>>>>> (10, because it is no longer a compound type).
> >>>>>>>>
> >>>>>>>> Seems we don't need V4L2_CTRL_TYPE_FIXED_POINT, just use V4L2_TYPE_INTEGER64?
> >>>>>>>>
> >>>>>>>> The reason I use the 'integer' and 'fractional' is that I want
> >>>>>>>> 'integer' to be the normal sample
> >>>>>>>> rate, for example 48kHz. The 'fractional' is the difference with
> >>>>>>>> normal sample rate.
> >>>>>>>>
> >>>>>>>> For example, the rate = 47998.12345. so integer = 48000, fractional= -1.87655.
> >>>>>>>>
> >>>>>>>> So if we use s64 for rate, then in driver need to convert the rate to
> >>>>>>>> the closed normal
> >>>>>>>> sample rate + fractional.
> >>>>>>>
> >>>>>>> That wasn't what the documentation said :-)
> >>>>>>>
> >>>>>>> So this is really two controls: one for the 'normal sample rate' (whatever 'normal'
> >>>>>>> means in this context) and the offset to the actual sample rate.
> >>>>>>>
> >>>>>>> Presumably the 'normal' sample rate is set once, while the offset changes
> >>>>>>> regularly.
> >>>>>>>
> >>>>>>> But why do you need the 'normal' sample rate? With audio resampling I assume
> >>>>>>> you resample from one rate to another, so why do you need a third 'normal'
> >>>>>>> rate?
> >>>>>>>
> >>>>>>
> >>>>>> 'Normal' rate is used to select the prefilter table.
> >>>>>>
> >>>>>
> >>>>> Currently I think we may define
> >>>>> V4L2_CID_M2M_AUDIO_SOURCE_RATE
> >>>>> V4L2_CID_M2M_AUDIO_DEST_RATE
> >>>>
> >>>> That makes sense.
> >>>>
> >>>>> V4L2_CID_M2M_AUDIO_ASRC_RATIO_MOD
> >>>>
> >>>> OK, can you document this control? Just write it down in the reply, I just want
> >>>> to understand how the integer value you set here is used.
> >>>>
> >>>
> >>> It is Q31 value. It is equal to:
> >>> in_rate_new / out_rate_new - in_rate_old / out_rate_old
> >>
> >> So that's not an integer. Also, Q31 is limited to -1...1, and I think
> >> that's too limiting.
> >>
> >> For this having a Q31.32 fixed point type still makes a lot of sense.
> >>
> >> I still feel this is a overly complicated API.
> >>
> >> See more below...
> >>
> >>>
> >>> Best regards
> >>> Wang shengjiu
> >>>
> >>>> Regards,
> >>>>
> >>>> Hans
> >>>>
> >>>>>
> >>>>> All of them can be V4L2_CTRL_TYPE_INTEGER.
> >>>>>
> >>>>> RATIO_MOD was defined in the very beginning version.
> >>>>> I think it is better to let users calculate this value.
> >>>>>
> >>>>> The reason is:
> >>>>> if we define the offset for source rate and dest rate in
> >>>>> driver separately, when offset of source rate is set,
> >>>>> driver don't know if it needs to wait or not the dest rate
> >>>>> offset, then go to calculate the ratio_mod.
> >>
> >> Ah, in order to update the ratio mod userspace needs to set both source and
> >> dest rate at the same time to avoid race conditions.
> >>
> >> That is perfectly possible in the V4L2 control framework. See:
> >>
> >> https://linuxtv.org/downloads/v4l-dvb-apis-new/driver-api/v4l2-controls.html#control-clusters
> >>
> >> In practice, isn't it likely that you would fix either the source or
> >> destination rate, and let the other rate fluctuate? It kind of feels weird
> >> to me that both source AND destination rates can fluctuate over time.
> >>
> > Right, the source and dest rates needn't change in same time.
> >
> >> In any case, with a control cluster it doesn't really matter, you can set
> >> one rate or both rates, and it will be handled atomically.
> >>
> >> I feel that the RATIO_MOD control is too hardware specific. This is something
> >> that should be hidden in the driver.
> >>
> >
> > I will use:
> >
> > V4L2_CID_M2M_AUDIO_SOURCE_RATE
> > V4L2_CID_M2M_AUDIO_DEST_RATE
> > V4L2_CID_M2M_AUDIO_SOURCE_RATE_OFFSET
> > V4L2_CID_M2M_AUDIO_DEST_RATE_OFFSET
> >
> > 'OFFSET' is V4L2_CTRL_TYPE_FIXED_POINT, which is Q31.32.
>
> So now I come back to my original question: why do you need both
> the rate and the offset? Isn't it enough to set just the rates,
> as long as that is in fixed point format?
>
> Why does the driver need both the 'ideal' rate + the offset?
>
> I'm not opposed to this, I'm just trying to understand whether this
> makes sense.
>
> Can't you take e.g. the source and dest rate as starting points
> when you start streaming? And every time userspace updates one or both
> of these rates you calculate the ratio_mod compared to the previous rates?
>
> Or is there a reason why you need the ideal rates as well? E.g. 48000 or
> 44100, etc.
>
ideal rates is used to select prefilter table. the prefilter table is indexed by
ideal rates.
The asrc has two part: prefilter and resampler filter. The convert ratio is
applied to rasampler filter part.
best regards
wang shengjiu