[PATCH v2 23/25] timekeeping: Rework do_adjtimex() to use shadow_timekeeper

From: Anna-Maria Behnsen
Date: Wed Oct 09 2024 - 04:35:12 EST


From: Anna-Maria Behnsen <anna-maria@xxxxxxxxxxxxx>

Updates of the timekeeper can be done by operating on the shadow timekeeper
and afterwards copying the result into the real timekeeper. This has the
advantage, that the sequence count write protected region is kept as small
as possible.

Convert do_adjtimex() to use this scheme and take the opportunity to use a
scoped_guard() for locking.

That requires to have a separate function for updating the leap state so
that the update is protected by the sequence count. This also brings the
timekeeper and the shadow timekeeper in sync for this state, which was not
the case so far. That's not a correctness problem as the state is only used
at the read sides which use the real timekeeper, but it's inconsistent
nevertheless.

Signed-off-by: Anna-Maria Behnsen <anna-maria@xxxxxxxxxxxxx>
---
kernel/time/timekeeping.c | 40 ++++++++++++++++++++++++----------------
1 file changed, 24 insertions(+), 16 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index e15e843ba2b8..b2683a589470 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -722,6 +722,18 @@ static inline void tk_update_leap_state(struct timekeeper *tk)
tk->next_leap_ktime = ktime_sub(tk->next_leap_ktime, tk->offs_real);
}

+/*
+ * Leap state update for both shadow and the real timekeeper
+ * Separate to spare a full memcpy() of the timekeeper.
+ */
+static void tk_update_leap_state_all(struct tk_data *tkd)
+{
+ write_seqcount_begin(&tkd->seq);
+ tk_update_leap_state(&tkd->shadow_timekeeper);
+ tkd->timekeeper.next_leap_ktime = tkd->shadow_timekeeper.next_leap_ktime;
+ write_seqcount_end(&tkd->seq);
+}
+
/*
* Update the ktime_t based scalar nsec members of the timekeeper
*/
@@ -2537,13 +2549,11 @@ EXPORT_SYMBOL_GPL(random_get_entropy_fallback);
*/
int do_adjtimex(struct __kernel_timex *txc)
{
- struct timekeeper *tk = &tk_core.timekeeper;
+ struct timekeeper *tk = &tk_core.shadow_timekeeper;
struct audit_ntp_data ad;
bool offset_set = false;
bool clock_set = false;
struct timespec64 ts;
- unsigned long flags;
- s32 orig_tai, tai;
int ret;

/* Validate the data before disabling interrupts */
@@ -2571,23 +2581,21 @@ int do_adjtimex(struct __kernel_timex *txc)
ktime_get_real_ts64(&ts);
add_device_randomness(&ts, sizeof(ts));

- raw_spin_lock_irqsave(&tk_core.lock, flags);
- write_seqcount_begin(&tk_core.seq);
+ scoped_guard (raw_spinlock_irqsave, &tk_core.lock) {
+ s32 orig_tai, tai;

- orig_tai = tai = tk->tai_offset;
- ret = __do_adjtimex(txc, &ts, &tai, &ad);
+ orig_tai = tai = tk->tai_offset;
+ ret = __do_adjtimex(txc, &ts, &tai, &ad);

- if (tai != orig_tai) {
- __timekeeping_set_tai_offset(tk, tai);
- timekeeping_update(&tk_core, tk, TK_MIRROR | TK_CLOCK_WAS_SET);
- clock_set = true;
- } else {
- tk_update_leap_state(tk);
+ if (tai != orig_tai) {
+ __timekeeping_set_tai_offset(tk, tai);
+ timekeeping_update_staged(&tk_core, TK_CLOCK_WAS_SET);
+ clock_set = true;
+ } else {
+ tk_update_leap_state_all(&tk_core);
+ }
}

- write_seqcount_end(&tk_core.seq);
- raw_spin_unlock_irqrestore(&tk_core.lock, flags);
-
audit_ntp_log(&ad);

/* Update the multiplier immediately if frequency was set directly */

--
2.39.5