Re: [RFC PATCH 1/2] time/timekeeping: Fix possible inconsistencies in _COARSE clockids

From: John Stultz
Date: Thu Mar 20 2025 - 14:02:19 EST


On Sun, Mar 16, 2025 at 9:56 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> On Sat, Mar 15 2025 at 16:22, John Stultz wrote:
> > On Sat, Mar 15, 2025 at 12:23 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> >> > So to fix this, rework the timekeeping_advance() logic a bit
> >> > so that when we are called from do_adjtimex() and the offset
> >> > is smaller then cycle_interval, that we call
> >> > timekeeping_forward(), to first accumulate the sub-interval
> >> > time into xtime_nsec. Then with no unaccumulated cycles in
> >> > offset, we can do the mult adjustment without worry of the
> >> > subtraction having an impact.
> >>
> >> It's a smart solution. I briefly pondered something similar, but I'm not
> >> really fond of the fact, that it causes a clock_was_set() event for no
> >> good reason.
> >>
> >> clock_was_set() means that there is a time jump. But that's absolutely
> >> not the case with do_adjtimex() changing the frequency for quick
> >> adjustments. That does not affect continuity at all.
> >
> > Oh, good point. I wasn't thinking clearly about the semantics, and
> > was being a little paranoid there (as most calls to
> > timekeeping_forward_now() have clock_was_set() calls that follow). I
> > suspect I can do away with that bit and avoid it. I'll respin later
> > this week.
>
> Actually your patch is not even emitting a clock_was_set() event:
>
> > + if (offset < real_tk->cycle_interval) {
> > + timekeeping_forward(tk, now);
> > + *clock_set = 1;
> > + return 0;
> > + }
>
> #define TK_CLEAR_NTP (1 << 0)
> #define TK_CLOCK_WAS_SET (1 << 1)
>
> So it clears NTP instead. Not really what you want either. :)

Hey Thomas,
Sorry for the slow reply here. So I agree with you that we don't
want to set clock_set above, that was my mistake. But this last bit I
don't think is right, as timekeeping advance() just returns a bool
(return !!clock_set;), which is used to decide to call clock_was_set()
or not - not the argument passed to clock_was_set().


> But yes, it simply can forward the clock without side effects.
>
> I think that this should be done for all TICK_ADV_FREQ adjustments. In
> case of such asynchronous adjustments it does not make any sense to take
> the old ntp_error value into account and accumlate some more. In fact
> this simply should clear ntp_error so the new value goes into effect as
> provided by NTP and not skewed by ntp_error.
>
> These async adjustments (usually very small ones) happen when the
> current source degrades and chronyd/ntpd switches over to a new server.
>
> Something like the below.

So I finally got a chance to look at the diff between your change and
mine, and your changes look good to me. Thanks again for catching the
clock_set thinko, and I agree clearing ntp_error looks like the right
thing to do. I'm going to do some testing here with them and resubmit
shortly.

thanks
-john