[PATCH] Fix rolling back of CLOCK_MONOTONIC_COARSE

From: Lei Chen
Date: Sun Mar 09 2025 - 23:00:28 EST


timekeeping_apply_adjustment try to adjust tkr_mono.mult by @mult_adj.
If @mult_adj > 0, tk->tkr_mono.xtime_nsec will be decreased by @offset.
Then timekeeping_update flushes shadow_timekeeper to real tk and vdso
data region. Then rolling back happens.

The drawing below illustrates the reason why timekeeping_apply_adjustment
descreases tk->tkr_mono.xtime_nsec.

cycle_interval offset clock_delta
x-----------------------x----------x---------------------------x

P0 P1 P2 P3

N(P) means the nano sec count at the point P.

Assume timekeeping_apply_adjustment runs at P1, with unaccumulated
cycles @offset. Then tkr_mono.mult is adjusted from M1 to M2.

Since offset happens before tkr_mono.mult adjustment, so we want to
achieve:
N(P3) == offset * M1 + clock_delta * M2 + N(P1) -------- (1)

But at P3, the code works as following:
N(P3) := (offset + clock_delta) * M2 + N(P1)
= offset * M2 + clock_delta * M2 + N(P1)

Apprently, N(P3) goes away from equation (1). To correct it, N(P1)
should be adjusted at P2:
N(P1) -= offset * (M2 - M1)

To fix this issue, the patch accumulates offset into tk, and export
N(P2) to real tk and vdso.

tk.tkr_mono := N(P2) = N(P1) + offset * M1

Then at P3, we calculate N(P3) based on N(P2) instead of N(P1):
N(P3) := N(P2) + clock_delta * M2

Signed-off-by: Lei Chen <lei.chen@xxxxxxxxxx>
---
kernel/time/timekeeping.c | 44 +++++++++------------------------------
1 file changed, 10 insertions(+), 34 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 1e67d076f195..65647f7bbc69 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1934,15 +1934,14 @@ static __always_inline void timekeeping_apply_adjustment(struct timekeeper *tk,
{
s64 interval = tk->cycle_interval;

- if (mult_adj == 0) {
+ if (mult_adj == 0)
return;
- } else if (mult_adj == -1) {
+
+ if (mult_adj == -1)
interval = -interval;
- offset = -offset;
- } else if (mult_adj != 1) {
+ else if (mult_adj != 1)
interval *= mult_adj;
- offset *= mult_adj;
- }
+

/*
* So the following can be confusing.
@@ -1963,33 +1962,8 @@ static __always_inline void timekeeping_apply_adjustment(struct timekeeper *tk,
* Which can be shortened to:
* xtime_interval += cycle_interval
*
- * So offset stores the non-accumulated cycles. Thus the current
- * time (in shifted nanoseconds) is:
- * now = (offset * adj) + xtime_nsec
- * Now, even though we're adjusting the clock frequency, we have
- * to keep time consistent. In other words, we can't jump back
- * in time, and we also want to avoid jumping forward in time.
- *
- * So given the same offset value, we need the time to be the same
- * both before and after the freq adjustment.
- * now = (offset * adj_1) + xtime_nsec_1
- * now = (offset * adj_2) + xtime_nsec_2
- * So:
- * (offset * adj_1) + xtime_nsec_1 =
- * (offset * adj_2) + xtime_nsec_2
- * And we know:
- * adj_2 = adj_1 + 1
- * So:
- * (offset * adj_1) + xtime_nsec_1 =
- * (offset * (adj_1+1)) + xtime_nsec_2
- * (offset * adj_1) + xtime_nsec_1 =
- * (offset * adj_1) + offset + xtime_nsec_2
- * Canceling the sides:
- * xtime_nsec_1 = offset + xtime_nsec_2
- * Which gives us:
- * xtime_nsec_2 = xtime_nsec_1 - offset
- * Which simplifies to:
- * xtime_nsec -= offset
++ * Since mult will be changed, @offset should be accumulated with original
++ * mult value
*/
if ((mult_adj > 0) && (tk->tkr_mono.mult + mult_adj < mult_adj)) {
/* NTP adjustment caused clocksource mult overflow */
@@ -1997,9 +1971,11 @@ static __always_inline void timekeeping_apply_adjustment(struct timekeeper *tk,
return;
}

+ tk->tkr_mono.xtime_nsec += offset * tk->tkr_mono.mult;
+ tk->tkr_mono.cycle_last += offset;
+
tk->tkr_mono.mult += mult_adj;
tk->xtime_interval += interval;
- tk->tkr_mono.xtime_nsec -= offset;
}

/*
--
2.44.0