Re: [RFC PATCH v2 2/8] timekeeping: Account for clawback adjustment in ntp_error

From: David Woodhouse

Date: Tue May 19 2026 - 06:11:34 EST


On Mon, 2026-05-18 at 18:59 -0700, John Stultz wrote:
> On Sun, May 17, 2026 at 3:03 PM David Woodhouse <dwmw2@xxxxxxxxxxxxx> wrote:
> >
> > From: David Woodhouse <dwmw@xxxxxxxxxxxx>
> >
> > timekeeping_apply_adjustment() modifies xtime_nsec to maintain vDSO
> > monotonicity when mult changes:
> >
> >     xtime_nsec -= offset
> >
> > This ensures that the time reported to userspace doesn't jump when the
> > multiplier is adjusted. However, ntp_error — which tracks the difference
> > between intended and actual clock position — was not updated to reflect
> > this change.
> >
> > After a mult change, xtime_nsec has moved but ntp_error still reflects
> > the old position. For the normal ±1 dithering this is negligible (the
> > adjustments cancel over time), but for larger mult changes — such as
> > when an external reference clock sets a new frequency — the one-time
> > uncompensated offset is significant (~38ns for a 700-count mult change).
> >
> > Fix by adjusting ntp_error by the same amount:
> >
> >     ntp_error += offset << ntp_error_shift
> >
> > This keeps ntp_error consistent with the actual xtime_nsec position
> > after the clawback.
> >
> > Fixes: 1b1b3e2a3671 ("timekeeping: Rework frequency adjustments to work better w/ nohz")
>
> That doesn't seem to be the right commit. Do you mean dc491596f639 ?

Er yes, that 1b1b commit doesn't even exist. I've been keeping the AI
on a *very* tight rein as I navigate all this, but that one escaped.

> But really, we used to do something like this, but it was removed in
> commit c2cda2a5bda9 ("timekeeping/ntp: Don't align NTP frequency
> adjustments to ticks").  From the difference in the math it looks like
> the previous implementation was maybe adjusting for the next tick
> instead of the previous?

The original subtraction of (interval - offset) actually goes all the
way back to commit 19923c190e093 in 2006 when the error field was first
introduced. I don't know if it was right then, but it certainly looks
like it was wrong in 2018 when commit c2cda2a5bda9 ripped it out.

What I'm adding back is 'ntp_error += offset'. I don't know why
'interval' was ever involved. As you suggest, I do think it does have
the effect of prematurely accounting for the changed xtime_interval of
the *next* tick... which is going to be correctly accounted when it
happens anyway, so it's a double addition.

I've reworked the commit message for the next round:
https://git.infradead.org/?p=users/dwmw2/linux.git;a=commitdiff;h=a1ea3c1bfd


I find my definitions (A) (B) (C) of the absolute time values
relatively simple to understand. We know *exactly* how much each of
them advances per tick. The *deltas* between them, represented by
ntp_error and time_offset, are somewhat harder to track. Tracking
ntp_error, for example, is always "Add what got added to (B), subtract
what got added to (A)". Hence the +ntp_interval, -xtime_interval we
discussed in the other commit which removed xtime_remainder.

I've added some local debugging which tracks those *absolute* values,
with associated sanity checks on the deltas which should precisely
match the difference between them on each tick. That's why I have a
reasonable amount of confidence that these fixes are correct.

> Also, since you're re-adding it, could you add a detailed rationale to
> the comment in timekeeping_apply_adjustment()? (It had long been on my
> todo, but by the time I started adding the commits the details had
> faded and I never got the time to re-derive the math.)

I was hoping not to have to think about that part. The fact here is
that it *does* apply an offset to 'xtime' and thus of course the delta
from xtime(A) to where we ought to be right now (B) has changed by the
same amount.

Calculating *what* that offset should be, is... above my pay grade :)

And I do think it's mostly working, so is there a particular reason you
want me to take a closer look?

Because the moment I start looking at the comment, I see the part which
says
* So given the same offset value, we need the time to be the same
* both before and after the freq adjustment.

... and I come to believe that 'before' the freq adjustment is actually
some point in the *future*; the last counter reading at which a vDSO
*currently* running on another CPU might possibly apply the faster
formula from the previous tick? And then my brain falls out and I have
to sit under the desk rocking back and forth for a while... ?


> Miroslav's review and input here would also be good.

Ack. Thomas had nudged me to add Miroslav to Cc, which
get_maintainers.pl had not.

Attachment: smime.p7s
Description: S/MIME cryptographic signature