Re: [PATCH 4/4] media: uvcvideo: Do not add clock samples with small sof delta
From: Hans de Goede
Date: Tue May 12 2026 - 04:46:53 EST
HI,
On 11-May-26 23:36, Ricardo Ribalda wrote:
> Hi Hans
>
> (Hi Laurent :P)
>
>
>
> On Mon, 11 May 2026 at 20:33, Hans de Goede <hansg@xxxxxxxxxx> wrote:
>>
>> Hi,
>>
>> On 11-May-26 18:50, Ricardo Ribalda wrote:
>>> Hi Hans
>>>
>>> On Mon, 11 May 2026 at 18:07, Hans de Goede <hansg@xxxxxxxxxx> wrote:
>>>>
>>>> Hi,
>>>>
>>>> On 23-Mar-26 14:10, Ricardo Ribalda wrote:
>>>>> Some UVC 1.1 cameras running in fast isochronous mode tend to spam the
>>>>> USB host with a lot of empty packets. These packets contain clock
>>>>> information and are added to the clock buffer but do not add any
>>>>> accuracy to the calculation. In fact, it is quite the opposite, in our
>>>>> calculations, only the first and the last timestamp is used, and we only
>>>>> have 32 slots.
>>>>>
>>>>> Ignore the samples that will produce less than MIN_HW_TIMESTAMP_DIFF
>>>>> data.
>>>>>
>>>>> Fixes: 141270bd95d4 ("media: uvcvideo: Refactor clock circular buffer")
>>>>> Cc: stable@xxxxxxxxxxxxxxx
>>>>> Signed-off-by: Ricardo Ribalda <ribalda@xxxxxxxxxxxx>
>>>>> ---
>>>>> drivers/media/usb/uvc/uvc_video.c | 18 ++++++++++++++++--
>>>>> 1 file changed, 16 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/media/usb/uvc/uvc_video.c b/drivers/media/usb/uvc/uvc_video.c
>>>>> index dcbc0941ffe6..e1a4e84d6841 100644
>>>>> --- a/drivers/media/usb/uvc/uvc_video.c
>>>>> +++ b/drivers/media/usb/uvc/uvc_video.c
>>>>> @@ -544,6 +544,19 @@ static void uvc_video_clock_add_sample(struct uvc_clock *clock,
>>>>> spin_unlock_irqrestore(&clock->lock, flags);
>>>>> }
>>>>>
>>>>> +static inline u16 sof_diff(u16 a, u16 b)
>>>>> +{
>>>>> + u32 aux;
>>>>> +
>>>>> + a &= 2047;
>>>>> + b &= 2047;
>>>>> + if (a >= b)
>>>>> + return a - b;
>>>>> +
>>>>> + aux = a + 2048;
>>>>> + return (u16)(aux - b);
>>>>> +}
>>>>> +
>>>>> static void
>>>>> uvc_video_clock_decode(struct uvc_streaming *stream, struct uvc_buffer *buf,
>>>>> const u8 *data, int len)
>>>>> @@ -664,12 +677,13 @@ uvc_video_clock_decode(struct uvc_streaming *stream, struct uvc_buffer *buf,
>>>>> sample.dev_sof = (sample.dev_sof + stream->clock.sof_offset) & 2047;
>>>>>
>>>>> /*
>>>>> - * To limit the amount of data, drop SCRs with an SOF identical to the
>>>>> + * To limit the amount of data, drop SCRs with an SOF similar to the
>>>>> * previous one. This filtering is also needed to support UVC 1.5, where
>>>>> * all the data packets of the same frame contains the same SOF. In that
>>>>> * case only the first one will match the host_sof.
>>>>> */
>>>>> - if (sample.dev_sof == stream->clock.last_sof)
>>>>> + if (sof_diff(sample.dev_sof, stream->clock.last_sof) <=
>>>>> + (MIN_HW_TIMESTAMP_DIFF / stream->clock.size))
>>>>> return;
>>>>
>>>> If I understand things correctly then uvc_video_clock_update() uses
>>>> first->host_time + some correction time. But you might end up not
>>>> storing a sample for the very first isochronous USB packet of a frame
>>>> because of this new check. Which means that the first->host_time used
>>>> as a starting point for the timestamp just has become inaccurate ?
>>>
>>> In UVC 1.5 All the ISOC packets have the same dev_sof and dev_stc.
>>> So this check will avoid adding a whole frame into the timestamp
>>> circular buffer when running at more than 320 Hz (1/(0.1/32))
>>>
>>> In UVC 1.1 all ISOC packets have the same dev_stc but different dev_sof.
>>> This check will avoid adding some of those packets into the circular
>>> buffer, but the accuracy will not be lost. We will use the data from
>>> the neighbour packets (even from previous frames) to recover the sof.
>>>
>>> The biggest winner for this patch is UVC 1.1, which will have much
>>> more accurate timestamps, because the distance between the first and
>>> last will be bigger (as in uvc1.5)
>>
>> I'm still trying to wrap my head about the whole concept of the hw
>> timestamps TBH.
>>
>> Upon reading it a couple of times I now see that when exactly we
>> take samples is not important because the actual frame time in
>> STC units is stored in buf->pts and that is supposed to be our
>> starting point. And the rest is just used to calculate
>> a factor + offset.
>>
>> At least that is what the big comment says but I'm confused by
>> the code which is supposed to implement:
>>
>> * SOF = (SOF2 - SOF1) / (STC2 - STC1) * PTS
>> * + (SOF1 * STC2 - SOF2 * STC1) / (STC2 - STC1)
>> *
>> * or
>> *
>> * SOF = ((SOF2 - SOF1) * PTS + SOF1 * STC2 - SOF2 * STC1) / (STC2 - STC1) (1)
>>
>> I think that the code tries to implement the second formula:
>>
>> We've (with some checks removed):
>>
>> /* First step, PTS to SOF conversion. */
>> delta_stc = buf->pts - (1UL << 31);
>> x1 = first->dev_stc - delta_stc;
>> x2 = last->dev_stc - delta_stc;
>>
>> y1 = (first->dev_sof + 2048) << 16;
>> y2 = (last->dev_sof + 2048) << 16;
>> if (y2 < y1)
>> y2 += 2048 << 16;
>>
>> y = (u64)(y2 - y1) * (1ULL << 31) + (u64)y1 * (u64)x2
>> - (u64)y2 * (u64)x1;
>> y = div_u64(y, x2 - x1);
>>
>> sof = y;
>>
>> Simplifying this by removing all the range-shifting
>> and using sof1/sof2 instead of y1/y2 like in the comment
>> we end up with:
>>
>> x1 = first->dev_stc - buf->pts;
>> x2 = last->dev_stc - buf->pts;
>>
>> sof1 = first->dev_sof;
>> sof2 = last->dev_sof
>>
>> sof = ((sof2 - sof1) + sof1 * x2 - sof2 * x1) / (x2 - x1)
>>
>> Now substitute stc1/stc2 for first->dev_stc / last->dev_stc
>> and just pts for buf->pts and expand x1 + x2 we get:
>>
>> sof = ((sof2 - sof1) + sof1 * (stc2 - pts) - sof2 * (stc1 - pts)) /
>> ((stc2 - pts) - (stc1 - pts))
>
>
>
> I think this is where your explanation goes slightly off:
>
> x2 is actually stc2 - pts + (1UL << 31), and x1 is stc1 - pts + (1UL << 31).
>
> Before you scream at me, look at the end of the mail! :P
>
>
>>
>> We can simplify the divisor here by getting rid of the pts bit
>> since the 2 "- pts" parts negate each other:
>>
>> sof = ((sof2 - sof1) + sof1 * (stc2 - pts) - sof2 * (stc1 - pts)) /
>> (stc2 - stc1)
>>
>> Now lets get rid of the () from expanding x1 / x2:
>>
>> sof = ((sof2 - sof1) + sof1 * stc2 - sof1 * pts - sof2 * stc1 + sof2 * pts)) /
>> (stc2 - stc1)
>>
>> Shuffle bringing " * pts" parts to the front:
>>
>> sof = (sof2 * pts - sof1 * pts + (sof2 - sof1) + sof1 * stc2 - sof2 * stc1)) /
>> (stc2 - stc1)
>>
>> Simplify:
>>
>> sof = ((sof2 - sof1) * pts + (sof2 - sof1) + sof1 * stc2 - sof2 * stc1) /
>> (stc2 - stc1)
>>
>> Looks a lot like the comment except there is a + (sof2 - sof1) too much
>> in there ?
>>
>> And some of the range shifting also feels wrong. As long as we're only
>> subtracting the range shifting is fine. But as soon as we start multiplying
>> variables in different shifted ranges the end result actually changes.
>>
>> Especially weird here is that we range-shift by (1UL << 31) for calculating
>> delta_stc and then *multiply* (y2 - y1) by (1ULL << 31) I guess this is
>> to compensate for the (1ULL << 31) component of x1/x2 but the first->dev_stc
>> and pts parts of x1 where never multiplied by (1ULL << 31) so these
>> are still in their original *scale*. Either we should multiply all
>> parts to go to some other fixed scale and the sof value are both range-shifted
>> by 2048 as well as multiplied by 65536, which also seems wrong to me as
>> soon as we do sof1 * stc2 or sof2 * stc1
>>
>> All in all this all feels like there are some issues lurking here and it
>> does not seem to match the comment at the top.
>>
>> Regards,
>>
>> Hans
>>
>>
>>
>
> Lets go back to the beggining:
>
> SOF = ((SOF2 - SOF1) * PTS + SOF1 * STC2 - SOF2 * STC1) / (STC2 - STC1)
>
> This is the formula for a straight line when you know two points. The
> names are super ugly, lets use something we are more used to:
>
> y = ((y2 - y1) * x + y1 * x2 - y2 * x1) / (x2 - x1) ;
>
> Ok. how would this look if we want x to be exactly at (1 << 31) to
> prevent unsigned underflow? ?
>
> we just have to move things around:
>
> delta = x - (1<<31);
>
> new_x1 = x1 - delta = x1 - x + (1<<31)
> new_x2 = x2 - delta = x2 - x + (1<<31)
>
> We plug this in the formula and:
>
> y = ((y2-y1) * (1 <<31) + y1 * new_x2 - y2*new_x1) /(new_x2-new_x1);
> Which is exactly what we have.
>
>
> Now lets look at the scaling:
>
> delta_stc = buf->pts - (1UL << 31);
> x1 = first->dev_stc - delta_stc;
> x2 = last->dev_stc - delta_stc;
>
> X1 and X2 are NOT scaled
>
> y1 = (first->dev_sof + 2048) << 16;
> y2 = (last->dev_sof + 2048) << 16;
> if (y2 < y1)
> y2 += 2048 << 16;
>
> Y1 and Y2 is scaled 16 (ignore the +2048, the variable is mod(2048))
>
> y = (u64)(y2 - y1) * (1ULL << 31) + (u64)y1 * (u64)x2
> - (u64)y2 * (u64)x1;
> y = div_u64(y, x2 - x1);
>
> y = (SCALE16 - SCALE16)*K + SCALE16*SCALE1 - SCALE16*SCALE1; => Result
> is SCALE16
> y = div64(SCALE16, SCALE1) => Result is SCALE16
>
> So it looks good to me. It is a painful code, but I think it is correct.
Ok, thank you for explaining this.
Regardless of the math which true me off, the actual sampling logic
is not that complicated.
Now that I realize that the exact sample moments do not matter because
buf->pts is our time-reference for the frame and not start->host_time
as my naive first reading of the code suggested this patch seems fine:
Reviewed-by: Hans de Goede <johannes.goede@xxxxxxxxxxxxxxxx>
Regards,
Hans