Re: [RFC PATCH v3 02/10] timekeeping: Remove xtime_remainder from ntp_error accumulation

From: David Woodhouse

Date: Sat May 30 2026 - 06:54:55 EST


On Fri, 2026-05-29 at 16:21 -0700, John Stultz wrote:
> On Fri, May 22, 2026 at 4:41 AM David Woodhouse <dwmw2@xxxxxxxxxxxxx> wrote:
> >
> > On Wed, 2026-05-20 at 14:33 +0100, David Woodhouse wrote:
> > > From: David Woodhouse <dwmw@xxxxxxxxxxxx>
> > >
> > > The ntp_error accumulator tracks the difference between the time actually
> > > reported to consumers in xtime, and the *intended* time. The former is
> > > subject to a sawtooth effect due to the quantisation of 'mult', which
> > > means that it actually advances by 'xtime_interval' each tick, while
> > > the intended clock advances by 'ntp_tick'.
> > >
> > > By dithering between adjacent integer values of 'mult' which result in
> > > an 'xtime_interval' slightly higher/lower than the intended tick length,
> > > the advancement of xtime is kept on average to the intended rate.
> > >
> > > The accounting should therefore adjust ntp_error by adding ntp_tick and
> > > subtracting xtime_interval on each tick.
> > >
> > > Since commit a386b5af8edd ("time: Compensate for rounding on
> > > odd-frequency clocksources") the value subtracted has been
> > > (xtime_interval + xtime_remainder), which is wrong. The effect is a
> > > systematic drift whose magnitude depends on the value of xtime_remainder
> > > and the NTP frequency correction. NTP masks this by continuously
> > > adjusting the frequency to compensate, but with a fixed frequency (or an
> > > external reference clock like vmclock), the drift is exposed.
> > >
> > > The value of xtime_remainder actually does represent the difference
> > > between the tick period and xtime_interval, so simply adding it instead
> > > of (+ tick length - xtime_remainder) might have made sense... except
> > > that it's only calculated once at boot time, so it's inaccurate anyway.
> > > So just kill it with fire.
> > >
> > > Also remove it from the mult computation in timekeeping_adjust(), which
> > > used it to offset the division for the same (incorrect) reason.
> > >
> > > Fixes: a386b5af8edd ("time: Compensate for rounding on odd-frequency clocksources")
> > > Signed-off-by: David Woodhouse <dwmw@xxxxxxxxxxxx>
> > > Assisted-by: Kiro:claude-opus-4.6-1m
> >
> > Hi John, I note you skipped this one and acked some of the rest of the
> > series. Do you have concerns?
> >
>
> Sorry, I've been busy with other things and haven't had much time to
> sit and think through the details on this one.
>
> You've pointed out well that the error in calculating the mult value
> (basically what gets rounded off) does get handled in the ntp_err and
> the resulting mult adjustments.
>
> My concern has mostly been for coarse grained clocksources, the error
> in the computed cycle_interval may not line up well with the desired
> ntp_interval and could be a secondary cause of error.

I'm tracking nanosecond precision. They're *all* coarse-grained to me :)

But you may be right. Not 'secondary cause of error' per se, but
perhaps not quite desirable behaviour...

The issue with coarser-grained clocksources is that the closest integer
multiple (mult) of cycle_interval is further from the intended tick
length.

For ACPI PM timer at 3.579545 MHz and CONFIG_HZ=1000 we end up with a
cycle_interval of 3579545 / 1000 = 3579.545 which tk_setup_internals()
rounds up to 3580.

With a perfect oscillator, that means each tick (3580 counts) actually
takes 1000124.7 ns (+125PPM vs. the theoretical), and we set 'mult' for
CLOCK_MONOTONIC_RAW accordingly (and for CLOCK_MONOTONIC too before NTP
kicks in).

So *that* is why we add 'ntp_tick + xtime_remainder' to one side of the
balance (while subtracting the actual xtime_interval). It's because
that basically *is* part of the tick length that we know 3580 counts to
be.

It's not entirely clear from the variable name or the commit message
that this was truly understood :)

Since NTP only works on ratios, I was right that it's a constant factor
that NTP can skew away, but that +125PPM is actually a significant
proportion of the maximum ±500PPM skew that adjtimex() will permit, so
there's only another +375PPM tolerance for the oscillator to actually
vary.

I switched https://david.woodhou.se/ntptest-virt/ to acpi_pm and you
can see it clearly in the frequency correction graph: my patched hosts
are showing precisely that -125PPM correction because of the tick
length.

So xtime_remainder is basically a bias to bring the behaviour of a
perfect oscillator right into the middle of the ±500PPM correction
band.

I will look at renaming it, or folding it into tick_length since that's
*actually* what it is (perhaps tk_setup_internals() should be setting
ntpinterval to cycle_interval * mult as a much cleaner way of achieving
the same thing, although ntp_update_frequency() would need to cope?)

At the very least, I can document it better. And perhaps I can make the
absolute reference tracking take it into account as part of
tick_length.

Thanks for nudging me to look closer.

Attachment: smime.p7s
Description: S/MIME cryptographic signature