Re: [PATCH 4/4] media: uvcvideo: Do not add clock samples with small sof delta

From: Ricardo Ribalda

Date: Mon May 11 2026 - 17:37:47 EST

Hi Hans

(Hi Laurent :P)

On Mon, 11 May 2026 at 20:33, Hans de Goede <hansg@xxxxxxxxxx> wrote:
>
> Hi,
>
> On 11-May-26 18:50, Ricardo Ribalda wrote:
> > Hi Hans
> >
> > On Mon, 11 May 2026 at 18:07, Hans de Goede <hansg@xxxxxxxxxx> wrote:
> >>
> >> Hi,
> >>
> >> On 23-Mar-26 14:10, Ricardo Ribalda wrote:
> >>> Some UVC 1.1 cameras running in fast isochronous mode tend to spam the
> >>> USB host with a lot of empty packets. These packets contain clock
> >>> information and are added to the clock buffer but do not add any
> >>> accuracy to the calculation. In fact, it is quite the opposite, in our
> >>> calculations, only the first and the last timestamp is used, and we only
> >>> have 32 slots.
> >>>
> >>> Ignore the samples that will produce less than MIN_HW_TIMESTAMP_DIFF
> >>> data.
> >>>
> >>> Fixes: 141270bd95d4 ("media: uvcvideo: Refactor clock circular buffer")
> >>> Cc: stable@xxxxxxxxxxxxxxx
> >>> Signed-off-by: Ricardo Ribalda <ribalda@xxxxxxxxxxxx>
> >>> ---
> >>> drivers/media/usb/uvc/uvc_video.c | 18 ++++++++++++++++--
> >>> 1 file changed, 16 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/drivers/media/usb/uvc/uvc_video.c b/drivers/media/usb/uvc/uvc_video.c
> >>> index dcbc0941ffe6..e1a4e84d6841 100644
> >>> --- a/drivers/media/usb/uvc/uvc_video.c
> >>> +++ b/drivers/media/usb/uvc/uvc_video.c
> >>> @@ -544,6 +544,19 @@ static void uvc_video_clock_add_sample(struct uvc_clock *clock,
> >>> spin_unlock_irqrestore(&clock->lock, flags);
> >>> }
> >>>
> >>> +static inline u16 sof_diff(u16 a, u16 b)
> >>> +{
> >>> + u32 aux;
> >>> +
> >>> + a &= 2047;
> >>> + b &= 2047;
> >>> + if (a >= b)
> >>> + return a - b;
> >>> +
> >>> + aux = a + 2048;
> >>> + return (u16)(aux - b);
> >>> +}
> >>> +
> >>> static void
> >>> uvc_video_clock_decode(struct uvc_streaming *stream, struct uvc_buffer *buf,
> >>> const u8 *data, int len)
> >>> @@ -664,12 +677,13 @@ uvc_video_clock_decode(struct uvc_streaming *stream, struct uvc_buffer *buf,
> >>> sample.dev_sof = (sample.dev_sof + stream->clock.sof_offset) & 2047;
> >>>
> >>> /*
> >>> - * To limit the amount of data, drop SCRs with an SOF identical to the
> >>> + * To limit the amount of data, drop SCRs with an SOF similar to the
> >>> * previous one. This filtering is also needed to support UVC 1.5, where
> >>> * all the data packets of the same frame contains the same SOF. In that
> >>> * case only the first one will match the host_sof.
> >>> */
> >>> - if (sample.dev_sof == stream->clock.last_sof)
> >>> + if (sof_diff(sample.dev_sof, stream->clock.last_sof) <=
> >>> + (MIN_HW_TIMESTAMP_DIFF / stream->clock.size))
> >>> return;
> >>
> >> If I understand things correctly then uvc_video_clock_update() uses
> >> first->host_time + some correction time. But you might end up not
> >> storing a sample for the very first isochronous USB packet of a frame
> >> because of this new check. Which means that the first->host_time used
> >> as a starting point for the timestamp just has become inaccurate ?
> >
> > In UVC 1.5 All the ISOC packets have the same dev_sof and dev_stc.
> > So this check will avoid adding a whole frame into the timestamp
> > circular buffer when running at more than 320 Hz (1/(0.1/32))
> >
> > In UVC 1.1 all ISOC packets have the same dev_stc but different dev_sof.
> > This check will avoid adding some of those packets into the circular
> > buffer, but the accuracy will not be lost. We will use the data from
> > the neighbour packets (even from previous frames) to recover the sof.
> >
> > The biggest winner for this patch is UVC 1.1, which will have much
> > more accurate timestamps, because the distance between the first and
> > last will be bigger (as in uvc1.5)
>
> I'm still trying to wrap my head about the whole concept of the hw
> timestamps TBH.
>
> Upon reading it a couple of times I now see that when exactly we
> take samples is not important because the actual frame time in
> STC units is stored in buf->pts and that is supposed to be our
> starting point. And the rest is just used to calculate
> a factor + offset.
>
> At least that is what the big comment says but I'm confused by
> the code which is supposed to implement:
>
> * SOF = (SOF2 - SOF1) / (STC2 - STC1) * PTS
> * + (SOF1 * STC2 - SOF2 * STC1) / (STC2 - STC1)
> *
> * or
> *
> * SOF = ((SOF2 - SOF1) * PTS + SOF1 * STC2 - SOF2 * STC1) / (STC2 - STC1) (1)
>
> I think that the code tries to implement the second formula:
>
> We've (with some checks removed):
>
> /* First step, PTS to SOF conversion. */
> delta_stc = buf->pts - (1UL << 31);
> x1 = first->dev_stc - delta_stc;
> x2 = last->dev_stc - delta_stc;
>
> y1 = (first->dev_sof + 2048) << 16;
> y2 = (last->dev_sof + 2048) << 16;
> if (y2 < y1)
> y2 += 2048 << 16;
>
> y = (u64)(y2 - y1) * (1ULL << 31) + (u64)y1 * (u64)x2
> - (u64)y2 * (u64)x1;
> y = div_u64(y, x2 - x1);
>
> sof = y;
>
> Simplifying this by removing all the range-shifting
> and using sof1/sof2 instead of y1/y2 like in the comment
> we end up with:
>
> x1 = first->dev_stc - buf->pts;
> x2 = last->dev_stc - buf->pts;
>
> sof1 = first->dev_sof;
> sof2 = last->dev_sof
>
> sof = ((sof2 - sof1) + sof1 * x2 - sof2 * x1) / (x2 - x1)
>
> Now substitute stc1/stc2 for first->dev_stc / last->dev_stc
> and just pts for buf->pts and expand x1 + x2 we get:
>
> sof = ((sof2 - sof1) + sof1 * (stc2 - pts) - sof2 * (stc1 - pts)) /
> ((stc2 - pts) - (stc1 - pts))

I think this is where your explanation goes slightly off:

x2 is actually stc2 - pts + (1UL << 31), and x1 is stc1 - pts + (1UL << 31).

Before you scream at me, look at the end of the mail! :P

>
> We can simplify the divisor here by getting rid of the pts bit
> since the 2 "- pts" parts negate each other:
>
> sof = ((sof2 - sof1) + sof1 * (stc2 - pts) - sof2 * (stc1 - pts)) /
> (stc2 - stc1)
>
> Now lets get rid of the () from expanding x1 / x2:
>
> sof = ((sof2 - sof1) + sof1 * stc2 - sof1 * pts - sof2 * stc1 + sof2 * pts)) /
> (stc2 - stc1)
>
> Shuffle bringing " * pts" parts to the front:
>
> sof = (sof2 * pts - sof1 * pts + (sof2 - sof1) + sof1 * stc2 - sof2 * stc1)) /
> (stc2 - stc1)
>
> Simplify:
>
> sof = ((sof2 - sof1) * pts + (sof2 - sof1) + sof1 * stc2 - sof2 * stc1) /
> (stc2 - stc1)
>
> Looks a lot like the comment except there is a + (sof2 - sof1) too much
> in there ?
>
> And some of the range shifting also feels wrong. As long as we're only
> subtracting the range shifting is fine. But as soon as we start multiplying
> variables in different shifted ranges the end result actually changes.
>
> Especially weird here is that we range-shift by (1UL << 31) for calculating
> delta_stc and then *multiply* (y2 - y1) by (1ULL << 31) I guess this is
> to compensate for the (1ULL << 31) component of x1/x2 but the first->dev_stc
> and pts parts of x1 where never multiplied by (1ULL << 31) so these
> are still in their original *scale*. Either we should multiply all
> parts to go to some other fixed scale and the sof value are both range-shifted
> by 2048 as well as multiplied by 65536, which also seems wrong to me as
> soon as we do sof1 * stc2 or sof2 * stc1
>
> All in all this all feels like there are some issues lurking here and it
> does not seem to match the comment at the top.
>
> Regards,
>
> Hans
>
>
>

Lets go back to the beggining:

SOF = ((SOF2 - SOF1) * PTS + SOF1 * STC2 - SOF2 * STC1) / (STC2 - STC1)

This is the formula for a straight line when you know two points. The
names are super ugly, lets use something we are more used to:

y = ((y2 - y1) * x + y1 * x2 - y2 * x1) / (x2 - x1) ;

Ok. how would this look if we want x to be exactly at (1 << 31) to
prevent unsigned underflow? ?

we just have to move things around:

delta = x - (1<<31);

new_x1 = x1 - delta = x1 - x + (1<<31)
new_x2 = x2 - delta = x2 - x + (1<<31)

We plug this in the formula and:

y = ((y2-y1) * (1 <<31) + y1 * new_x2 - y2*new_x1) /(new_x2-new_x1);
Which is exactly what we have.

Now lets look at the scaling:

delta_stc = buf->pts - (1UL << 31);
x1 = first->dev_stc - delta_stc;
x2 = last->dev_stc - delta_stc;

X1 and X2 are NOT scaled

y1 = (first->dev_sof + 2048) << 16;
y2 = (last->dev_sof + 2048) << 16;
if (y2 < y1)
y2 += 2048 << 16;

Y1 and Y2 is scaled 16 (ignore the +2048, the variable is mod(2048))

y = (u64)(y2 - y1) * (1ULL << 31) + (u64)y1 * (u64)x2
- (u64)y2 * (u64)x1;
y = div_u64(y, x2 - x1);

y = (SCALE16 - SCALE16)*K + SCALE16*SCALE1 - SCALE16*SCALE1; => Result
is SCALE16
y = div64(SCALE16, SCALE1) => Result is SCALE16

So it looks good to me. It is a painful code, but I think it is correct.

(Painful, but I would probably fail to do it better :P)

Regards

--
Ricardo Ribalda