Re: [RFC PATCH v2 3/8] timekeeping: Clamp time_offset delta to prevent infinite tail

From: David Woodhouse

Date: Tue May 19 2026 - 11:14:30 EST


On Tue, 2026-05-19 at 16:17 +0200, Miroslav Lichvar wrote:
> On Tue, May 19, 2026 at 02:31:41PM +0100, David Woodhouse wrote:
> > On Tue, 2026-05-19 at 15:25 +0200, Miroslav Lichvar wrote:
> > > I don't think that is an acceptable change of the filter. The impact
> > > could be measured on a sufficiently stable clock.
> > >
> > > To me that looks like using a wrong tool for the job.
> >
> > I chose 20ns/s because it's pretty much in the noise of the existing
> > jitter. The idea here is that there's no change in the initial part of
> > the exponential delivery of time_offset, but the long asymptotic tail
> > ends up applying a skew per second which is *far* smaller than the
> > inter-tick jitter of the output anyway, which seems pointless?
>
> It changes the initial part too. Consider a case where the PLL time
> constant is set to 0 and the application is updating the PLL once per
> second. ntp_offset_chunk() returns 1/4th of time_offset. If the
> NTP/PTP measurements are stable to about 20 nanoseconds, the clock
> corrections will be 4 times larger than expected.
>
> By inter-tick jitter you mean the +1/0 multiplier changes? That
> can be below 1 nanosecond if the clock is updated frequently enough
> and the multiplier is sufficiently large.
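
To make the arithmetic in that objection concrete, here is a minimal
Python sketch. It assumes the per-second slice is time_offset shifted
right by (SHIFT_PLL + time constant) with SHIFT_PLL = 2, which matches
the "1/4th" figure above for a time constant of 0. The clamp rule and
the names clamped_chunk() and MIN_CHUNK_NS are hypothetical stand-ins
for the 20ns/s floor, not the code from the patch, and it works in
plain nanoseconds rather than the kernel's scaled fixed-point values.

```python
import math

SHIFT_PLL = 2        # assumed: chunk = offset >> (SHIFT_PLL + time_constant)
MIN_CHUNK_NS = 20.0  # hypothetical floor modelling the proposed 20ns/s clamp


def offset_chunk(offset_ns, time_constant=0):
    """Unclamped slice of the remaining offset applied in one second."""
    return offset_ns / (1 << (SHIFT_PLL + time_constant))


def clamped_chunk(offset_ns, time_constant=0, floor_ns=MIN_CHUNK_NS):
    """Same slice, but never smaller than floor_ns (hypothetical rule).

    The floor is capped at the remaining offset so it cannot overshoot.
    """
    chunk = offset_chunk(offset_ns, time_constant)
    if abs(chunk) >= floor_ns:
        return chunk
    return math.copysign(min(abs(offset_ns), floor_ns), offset_ns)


if __name__ == "__main__":
    # A measurement that is stable to about 20ns, PLL time constant 0.
    offset = 20.0
    print("unclamped:", offset_chunk(offset), "ns")
    print("clamped:  ", clamped_chunk(offset), "ns")
```

Under these assumptions a 20ns offset yields a 5ns correction without
the floor and a 20ns correction with it, which is the factor of 4.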

> > Without it, the output basically *never* converges to the desired line.
>
> I think it's not supposed to get to zero. It is expected to be updated
> regularly with new measurements.

Fair enough. I think I'm happy to drop this. Much of my testing for the
ntp_error and time_offset fixes has been in a completely artificial
environment where I *stop* chrony on the host, advertise a single
(stale) rate through vmclock, and make sure the core timekeeping *can*
converge to that without constantly drifting due to the tracking errors
that I've fixed. The infinite convergence was messing with that, but I
guess it won't matter much in the real world.
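
For illustration, the tail behaviour in that artificial setup (one
offset, no further updates) can be modelled as below. This is an
idealised sketch, not the kernel code: it assumes a time constant of 0
so that a quarter of the remaining offset is applied each second, uses
floating-point nanoseconds, and the floor_ns parameter is a
hypothetical model of the 20ns/s minimum.

```python
import math


def step(offset_ns, floor_ns=0.0, shift=2):
    """Apply one second of correction and return the remaining offset.

    shift=2 models a time constant of 0 (a quarter of the offset per
    second). floor_ns=0.0 disables the hypothetical minimum skew.
    """
    chunk = offset_ns / (1 << shift)
    if abs(chunk) < floor_ns:
        # Below the floor: apply the floor, capped at what is left.
        chunk = math.copysign(min(abs(offset_ns), floor_ns), offset_ns)
    return offset_ns - chunk


def seconds_until_zero(offset_ns, floor_ns=0.0, limit=1000):
    """Seconds until the offset is exactly zero, or None if not within limit."""
    for second in range(limit):
        if offset_ns == 0:
            return second
        offset_ns = step(offset_ns, floor_ns)
    return None


if __name__ == "__main__":
    print("no floor:  ", seconds_until_zero(2000.0))
    print("20ns floor:", seconds_until_zero(2000.0, floor_ns=20.0))
```

In this model the unfloored offset shrinks geometrically and never
reaches zero within the limit, while the floored one finishes in a
bounded number of seconds.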

My test calls ktime_get_snapshot() and compares the resulting
CLOCK_REALTIME with the vmclock time calculated from the *same* TSC
value, printing the difference every 500ms.

The output below is from a test case where I deliberately introduced a
2µs offset after the initial convergence, to check that it injects
precisely 2000ns, no more and no less:

[ 50.900372] vmclock_cmp: diff=-2003ns tsc=1ca1714991
[ 51.404369] vmclock_cmp: diff=-1999ns tsc=1ce98a34e9
[ 51.908369] vmclock_cmp: diff=-2001ns tsc=1d31a33821
[ 52.412365] vmclock_cmp: diff=-2003ns tsc=1d79bc1c45
[ 52.916364] vmclock_cmp: diff=-2005ns tsc=1dc1d5189d
[ 53.420360] vmclock_cmp: diff=-2003ns tsc=1e09edfcc9
[ 53.924361] vmclock_cmp: diff=-2001ns tsc=1e52070cd1
[ 54.428370] vmclock_cmp: diff=-2007ns tsc=1e9a206b9d
[ 54.932360] vmclock_cmp: diff=-2002ns tsc=1ee2391235
[ 55.436372] vmclock_cmp: diff=-2003ns tsc=1f2a528a9d
[ 55.940368] vmclock_cmp: diff=-1999ns tsc=1f726b6e91
[ 56.444363] vmclock_cmp: diff=-2001ns tsc=1fba844d09
[ 56.948363] vmclock_cmp: diff=-2001ns tsc=20029d5251
[ 57.452384] vmclock_cmp: diff=-1997ns tsc=204ab72295
[ 57.956363] vmclock_cmp: diff=-2002ns tsc=2092cf5f5d
[ 58.460367] vmclock_cmp: diff=-2002ns tsc=20dae89265
[ 58.964374] vmclock_cmp: diff=-2001ns tsc=212301d9cd
[ 59.468366] vmclock_cmp: diff=-2002ns tsc=216b1a91c1
[ 59.972370] vmclock_cmp: diff=-2001ns tsc=21b333c671
[ 60.476364] vmclock_cmp: diff=-1998ns tsc=21fb4c9295

So there's still a jitter of single-digit nanoseconds, which is why I
figured a minimum of 20ns/s for the *deliberate* skew was negligible
and well within the noise. But I'm happy to drop it.
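
For anyone who wants to quantify that jitter, a short Python sketch
that parses the vmclock_cmp lines above and summarises the diff values
follows. The regular expression and the summarise() helper are my own
illustration, not part of the test module.

```python
import re
from statistics import mean

LOG = """\
[ 50.900372] vmclock_cmp: diff=-2003ns tsc=1ca1714991
[ 51.404369] vmclock_cmp: diff=-1999ns tsc=1ce98a34e9
[ 51.908369] vmclock_cmp: diff=-2001ns tsc=1d31a33821
[ 52.412365] vmclock_cmp: diff=-2003ns tsc=1d79bc1c45
[ 52.916364] vmclock_cmp: diff=-2005ns tsc=1dc1d5189d
[ 53.420360] vmclock_cmp: diff=-2003ns tsc=1e09edfcc9
[ 53.924361] vmclock_cmp: diff=-2001ns tsc=1e52070cd1
[ 54.428370] vmclock_cmp: diff=-2007ns tsc=1e9a206b9d
[ 54.932360] vmclock_cmp: diff=-2002ns tsc=1ee2391235
[ 55.436372] vmclock_cmp: diff=-2003ns tsc=1f2a528a9d
[ 55.940368] vmclock_cmp: diff=-1999ns tsc=1f726b6e91
[ 56.444363] vmclock_cmp: diff=-2001ns tsc=1fba844d09
[ 56.948363] vmclock_cmp: diff=-2001ns tsc=20029d5251
[ 57.452384] vmclock_cmp: diff=-1997ns tsc=204ab72295
[ 57.956363] vmclock_cmp: diff=-2002ns tsc=2092cf5f5d
[ 58.460367] vmclock_cmp: diff=-2002ns tsc=20dae89265
[ 58.964374] vmclock_cmp: diff=-2001ns tsc=212301d9cd
[ 59.468366] vmclock_cmp: diff=-2002ns tsc=216b1a91c1
[ 59.972370] vmclock_cmp: diff=-2001ns tsc=21b333c671
[ 60.476364] vmclock_cmp: diff=-1998ns tsc=21fb4c9295
"""

DIFF_RE = re.compile(r"vmclock_cmp: diff=(-?\d+)ns")


def parse_diffs(log_text):
    """Extract the diff values (in ns) from vmclock_cmp log lines."""
    return [int(m.group(1)) for m in DIFF_RE.finditer(log_text)]


def summarise(diffs):
    """Return the mean, minimum, maximum and peak-to-peak spread in ns."""
    return {
        "mean": mean(diffs),
        "min": min(diffs),
        "max": max(diffs),
        "peak_to_peak": max(diffs) - min(diffs),
    }


if __name__ == "__main__":
    stats = summarise(parse_diffs(LOG))
    for name, value in stats.items():
        print(f"{name}: {value} ns")
```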

