Re: [PATCH v5 0/6] timekeeping/ntp: Fix drift tracking precision

From: David Woodhouse

Date: Thu Jun 11 2026 - 13:39:35 EST


On Wed, 2026-06-10 at 23:30 +0100, David Woodhouse wrote:
> This is the bugfix subset of the RFC series last posted at
>   https://lore.kernel.org/all/20260520135207.37826-1-dwmw2@xxxxxxxxxxxxx/
>
> These patches stand alone and fix several long-standing precision issues
> in the kernel's NTP drift tracking. The feed-forward clock discipline
> which was the reason I was *looking* remains waiting in the wings (and
> my ffclock branch) for now; this is just the precision fixes on which
> it will depend.

https://git.infradead.org/?p=users/dwmw2/linux.git;a=shortlog;h=refs/heads/ffclock
addresses the Sashiko feedback and will be the next round, perhaps once
meat-based reviewers have had a chance to opine.

There's a new patch to resolve the hypothetical issue of time_offset
and time_adjust are pushing in *different* directions. Previously that
could get stuck with the clock not actually moving. Now we explicitly
drain from both the amount of skew that they counteract, each second.

From: David Woodhouse <dwmw@xxxxxxxxxxxx>
Date: Thu, 11 Jun 2026 09:49:07 +0100
Subject: [PATCH 06/10] timekeeping: Settle competing time_offset and
time_adjust skew

time_offset (the exponential PLL phase slew) and time_adjust (the
linear adjtime() slew) can be asked to move the clock in opposite
directions. second_overflow() folds only their *net* into the per-tick
skew_delta, so the cancelling overlap would never be drained from
either tracker by the per-tick code -- and if they cancel exactly,
skew_delta is zero and neither converges at all.

Transfer the overlapping ("conflict") portion directly between the two
trackers each second so it is accounted without being slewed onto the
clock and later recovered. Keep time_adjust and its sub-microsecond
remainder (time_adjust_frac) the same sign while doing so, since the
drain treats abs(time_adjust_frac) as same-direction headroom; an
opposing remainder there would make it deliver phase that was never
removed from the pile, leaving a residual offset from the reference.

Signed-off-by: David Woodhouse <dwmw@xxxxxxxxxxxx>
Assisted-by: Kiro:claude-opus-4.8
---
kernel/time/ntp.c | 160 +++++++++++++++++++++++++++++++++++-----------
1 file changed, 121 insertions(+), 39 deletions(-)

diff --git a/kernel/time/ntp.c b/kernel/time/ntp.c
index 80fe27451fac..aa9a0593952d 100644
--- a/kernel/time/ntp.c
+++ b/kernel/time/ntp.c
@@ -481,6 +481,69 @@ s64 ntp_drain_time_adjust(unsigned int tkid, s64 amount, unsigned int shift)
return (amount > 0) ? amount - claimed : amount + claimed;
}

+/*
+ * time_offset (drained exponentially) and time_adjust (drained linearly at the
+ * MAX_TICKADJ rate) can be asked to slew the clock in opposite directions.
+ * second_overflow() only folds their *net* into skew_delta, so the cancelling
+ * part would never be drained from either tracker via the per-tick code -- and
+ * if they cancel exactly, skew_delta is zero and neither converges at all.
+ *
+ * Settle that cancelling phase directly between the two here. No clock motion
+ * results (the opposing slews annihilate), but both move toward zero so neither
+ * stalls. @amount is the phase to take off time_offset, in its (÷HZ) units and
+ * with its sign; the same real magnitude comes off time_adjust in the opposite
+ * direction. Clamped so neither tracker is driven past zero.
+ */
+static void ntp_transfer_offset_adjust(struct ntp_data *ntpdata, s64 amount)
+{
+ s64 frac_delta, carry;
+
+ /*
+ * Don't drain time_offset past zero. @amount shares its sign and is
+ * normally bounded below it by ntp_offset_chunk(), but the ±1 skew_delta
+ * floor for a tiny time_offset can exceed it, so clamp.
+ */
+ if (abs(amount) > abs(ntpdata->time_offset))
+ amount = ntpdata->time_offset;
+ if (!amount)
+ return;
+
+ /*
+ * Remove the matching phase from time_adjust, in plain shifted-ns. No
+ * clamp against time_adjust's zero is needed: @amount is bounded by the
+ * adjtime chunk, which second_overflow() never lets exceed time_adjust's
+ * own pending phase, so this cannot overshoot.
+ */
+ frac_delta = amount * NTP_INTERVAL_FREQ;
+
+ ntpdata->time_offset -= amount;
+
+ /* Add the matching phase to time_adjust, carrying whole µs (O(1)). */
+ ntpdata->time_adjust_frac += frac_delta;
+ if (ntpdata->time_adjust_frac >= ONE_US_NS ||
+ ntpdata->time_adjust_frac <= -ONE_US_NS) {
+ carry = ntpdata->time_adjust_frac / ONE_US_NS;
+ ntpdata->time_adjust += carry;
+ ntpdata->time_adjust_frac -= carry * ONE_US_NS;
+ }
+
+ /*
+ * Keep time_adjust and its sub-µs remainder the same sign. The
+ * truncating carry above can leave them opposed (e.g. +4 µs paired
+ * with -250 ns), and ntp_drain_time_adjust() treats abs(time_adjust_frac)
+ * as same-direction drawer capacity -- an opposing remainder there makes
+ * it over-deliver phase that was never removed from the pile. Borrow or
+ * repay a single whole µs to realign; the total phase is unchanged.
+ */
+ if (ntpdata->time_adjust > 0 && ntpdata->time_adjust_frac < 0) {
+ ntpdata->time_adjust--;
+ ntpdata->time_adjust_frac += ONE_US_NS;
+ } else if (ntpdata->time_adjust < 0 && ntpdata->time_adjust_frac > 0) {
+ ntpdata->time_adjust++;
+ ntpdata->time_adjust_frac -= ONE_US_NS;
+ }
+}
+
/**
* ntp_get_next_leap - Returns the next leapsecond in CLOCK_REALTIME ktime_t
* @tkid: Timekeeper ID
@@ -576,53 +639,72 @@ int second_overflow(unsigned int tkid, time64_t secs)
/* Compute the phase adjustment for the next second */
ntpdata->tick_length = ntpdata->tick_length_base;

- /*
- * Set the per-tick skew rate for the tick code. This is in
- * the same units as tick_length (ns << NTP_SCALE_SHIFT).
- * tick_offset is so low that the skew imparted would round to
- * zero, pass the bare minimum ±1. It won't overshoot because
- * logarithmic_accumulation() only drains what it can from
- * time_offset and the rest ends up in ntp_error which drives
- * the selection of 'mult' immediately each tick.
- */
- if (ntpdata->time_offset) {
- s64 delta = ntp_offset_chunk(ntpdata, ntpdata->time_offset);
- ntpdata->skew_delta = div_s64(delta, NTP_INTERVAL_FREQ);
-
- if (!ntpdata->skew_delta)
- ntpdata->skew_delta = (ntpdata->time_offset > 0) ? 1 : -1;
- } else {
- ntpdata->skew_delta = 0;
- }
-
/* Check PPS signal */
pps_dec_valid(ntpdata);

/*
- * Bias the per-tick skew for any pending adjtime() correction, at up
- * to MAX_TICKADJ (500us) per second. This only sizes the mult bias
- * (and hence the per-tick drain budget); time_adjust itself is drained
- * in logarithmic_accumulation() via ntp_drain_time_adjust(), per tick,
- * so it is never decremented behind the per-tick accounting's back and
- * never staged into time_offset (which would smear the exponential).
+ * Work out, in time_offset units (shifted-ns / HZ), the per-second
+ * phase each tracker wants to deliver: time_offset's exponential chunk
+ * and time_adjust's linear chunk (capped at the MAX_TICKADJ rate, and
+ * divided by HZ *once* to reach these units -- not the per-tick HZ*HZ).
+ * Settle any opposing overlap directly between the two at this
+ * precision, *before* reducing to the per-tick rate, so a small
+ * cancelling contribution is not lost to truncation. Then reduce the
+ * net to skew_delta and apply the ±1 floor to the *combined* result so
+ * any non-zero net is still delivered (it won't overshoot:
+ * logarithmic_accumulation() only drains what each tracker can take and
+ * the rest lands in ntp_error, which steers 'mult' the same tick).
*/
- if (ntpdata->time_adjust || ntpdata->time_adjust_frac) {
- long adj = clamp(ntpdata->time_adjust,
- (long)-MAX_TICKADJ, (long)MAX_TICKADJ);
+ if (ntpdata->time_offset || ntpdata->time_adjust ||
+ ntpdata->time_adjust_frac) {
+ s64 off_chunk = ntp_offset_chunk(ntpdata, ntpdata->time_offset);
+ s64 adj_chunk = 0, net;
+
+ /*
+ * Once the exponential chunk rounds to zero, deliver the last
+ * remaining offset this second so it converges to zero instead
+ * of stalling just above it.
+ */
+ if (!off_chunk)
+ off_chunk = ntpdata->time_offset;
+
+ if (ntpdata->time_adjust || ntpdata->time_adjust_frac) {
+ long adj = clamp(ntpdata->time_adjust,
+ (long)-MAX_TICKADJ, (long)MAX_TICKADJ);
+ /*
+ * Per-second linear phase, in shifted-ns: whole-us part
+ * plus the sub-us drawer, clamped to the MAX_TICKADJ
+ * rate. Including the drawer keeps it draining until the
+ * last sub-us remainder is flushed. Divide by HZ once to
+ * reach time_offset units.
+ */
+ s64 chunk = (s64)adj * ONE_US_NS + ntpdata->time_adjust_frac;
+ s64 max = (s64)MAX_TICKADJ * ONE_US_NS;
+
+ adj_chunk = div_s64(clamp(chunk, -max, max),
+ NTP_INTERVAL_FREQ);
+ }
+
/*
- * Per-second linear phase to drive, in shifted-ns: the
- * whole-us part plus the sub-us drawer, clamped to the
- * MAX_TICKADJ rate. Including the drawer ensures the drive
- * (and hence the drain) continues until the last sub-us
- * remainder is flushed, not just until time_adjust hits zero.
+ * If the two slews oppose, only their net would drive the
+ * per-tick drain, so the cancelling part would never drain from
+ * either tracker and an exact cancellation would stall both.
+ * Settle that overlap directly between them (no clock motion).
*/
- s64 chunk = (s64)adj * ONE_US_NS + ntpdata->time_adjust_frac;
- s64 max = (s64)MAX_TICKADJ * ONE_US_NS;
+ if (off_chunk && adj_chunk && (off_chunk > 0) != (adj_chunk > 0)) {
+ s64 conflict = min(abs(off_chunk), abs(adj_chunk));
+
+ ntp_transfer_offset_adjust(ntpdata,
+ (off_chunk > 0) ? conflict : -conflict);
+ }

- chunk = clamp(chunk, -max, max);
- /* shifted-ns/second -> per-tick skew_delta: divide by HZ*HZ */
- ntpdata->skew_delta += div_s64(chunk,
- (s64)NTP_INTERVAL_FREQ * NTP_INTERVAL_FREQ);
+ /* Net is what the clock delivers; reduce to per-tick, then floor. */
+ net = off_chunk + adj_chunk;
+ ntpdata->skew_delta = div_s64(net, NTP_INTERVAL_FREQ);
+ if (!ntpdata->skew_delta && net)
+ ntpdata->skew_delta = (net > 0) ? 1 : -1;
+ } else {
+ ntpdata->skew_delta = 0;
}

return leap;
--
2.43.0

Attachment: smime.p7s
Description: S/MIME cryptographic signature