[RFC v8 25/28] timekeeping: inform clockevents about freq adjustments

From: Nicolai Stange
Date: Sat Nov 19 2016 - 11:12:30 EST


Upon adjustments of the monotonic clock's frequency from the
timekeeping core, the clockevent devices' ->mult_adjusted must be
updated accordingly, too.

Introduce clockevents_adjust_all_freqs(), which traverses all registered
clockevent devices and, for each oneshot-capable device that has neither
CLOCK_EVT_FEAT_DUMMY nor CLOCK_EVT_FEAT_NO_ADJUST set, recalculates its
->mult_adjusted based on the monotonic clock's current frequency.

Call clockevents_adjust_all_freqs() from timekeeping_freqadjust().

Note that it might look as if timekeeping_apply_adjustment() were the
more natural place to trigger the clockevent devices' frequency updates
from: it is the single place where the mono clock's ->mult is changed.
However, timekeeping_apply_adjustment() is also invoked for the
on-off-controlled adjustments made to the mono clock's ->mult from
timekeeping_adjust(). These adjustments are very small in magnitude and,
more importantly, exhibit some oscillatory behaviour once the NTP error
becomes small. We don't want the clockevent devices' ->mult_adjusted
values to follow these oscillations, because they're negligible and
because chasing them would periodically destroy what
clockevents_increase_min_delta() might have built up.

Performance impact:
The following measurements have been carried out on a Raspberry Pi 2B
(armv7, 4 cores, 900MHz). The adjustment process has been driven by
periodically injecting a certain offset via adjtimex(2). The runtime of
clockevents_adjust_all_freqs() has been measured for three workloads.

- CPU stressed system: no interrupts except for the timer interrupts. An
adjtimex(2) every 3.7h keeps the adjustment process running. A
'stress --cpu 8' hogs the four cores.
  Mean: 1733.75 +- 81.10 ns
  Quantiles:
       0%      25%      50%      75%     100%
     1458     1671     1717     1800     2083  (ns)

- Memory stressed system: same setup but with some memory load generated
by 'stress --vm 8 --vm-bytes 32M' instead.
  Mean: 8750.73 +- 1034.77 ns
  Quantiles:
       0%      25%      50%      75%     100%
     3854     8171     8901     9452    13593  (ns)

- Idle case: same setup as above but with no stress at all. The results
are very similar to the ones obtained for the case of CPU stress.

Since clockevents_adjust_all_freqs() runs with interrupts disabled, the
difference between the CPU-stressed and the memory-stressed case is best
explained by cache hits vs. misses, compounded by a busy memory bus in
the latter case.

Future patches will partly mitigate the memory-stressed case.

Signed-off-by: Nicolai Stange <nicstange@xxxxxxxxx>
---
kernel/time/clockevents.c | 34 ++++++++++++++++++++++++++++++++++
kernel/time/tick-internal.h | 5 +++++
kernel/time/timekeeping.c | 2 ++
3 files changed, 41 insertions(+)

diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
index 7dd71bb..6146d2f 100644
--- a/kernel/time/clockevents.c
+++ b/kernel/time/clockevents.c
@@ -626,6 +626,40 @@ void __clockevents_adjust_freq(struct clock_event_device *dev)
                                               mult_cs_raw);
 }
 
+void clockevents_adjust_all_freqs(u32 mult_cs_mono, u32 mult_cs_raw)
+{
+        u32 last_mult_raw = 0, last_shift = 0, last_mult_adjusted = 0;
+        u32 mult_raw, shift;
+        unsigned long flags;
+        struct clock_event_device *dev;
+
+        raw_spin_lock_irqsave(&clockevents_lock, flags);
+        list_for_each_entry(dev, &clockevent_devices, list) {
+                if (!(dev->features & CLOCK_EVT_FEAT_ONESHOT) ||
+                    (dev->features & CLOCK_EVT_FEAT_DUMMY) ||
+                    (dev->features & CLOCK_EVT_FEAT_NO_ADJUST))
+                        continue;
+
+                /*
+                 * The cached last_mult_adjusted is only valid if
+                 * shift == last_shift. Otherwise, it could exceed
+                 * what is allowed by ->max_delta_ns.
+                 */
+                mult_raw = dev->mult;
+                shift = dev->shift;
+                if (mult_raw != last_mult_raw || shift != last_shift) {
+                        last_mult_raw = mult_raw;
+                        last_shift = shift;
+                        last_mult_adjusted =
+                                __clockevents_calc_adjust_freq(mult_raw,
+                                                               mult_cs_mono,
+                                                               mult_cs_raw);
+                }
+                dev->mult_adjusted = last_mult_adjusted;
+        }
+        raw_spin_unlock_irqrestore(&clockevents_lock, flags);
+}
+
 int __clockevents_update_freq(struct clock_event_device *dev, u32 freq)
 {
         clockevents_config(dev, freq);
diff --git a/kernel/time/tick-internal.h b/kernel/time/tick-internal.h
index 0b29d23..2d97c42 100644
--- a/kernel/time/tick-internal.h
+++ b/kernel/time/tick-internal.h
@@ -56,6 +56,7 @@ extern int clockevents_program_event(struct clock_event_device *dev,
                                      ktime_t expires, bool force);
 extern void clockevents_handle_noop(struct clock_event_device *dev);
 extern int __clockevents_update_freq(struct clock_event_device *dev, u32 freq);
+extern void clockevents_adjust_all_freqs(u32 mult_cs_mono, u32 mult_cs_raw);
 extern void timekeeping_get_mono_mult(u32 *mult_cs_mono, u32 *mult_cs_raw);
 extern ssize_t sysfs_get_uname(const char *buf, char *dst, size_t cnt);
 
@@ -95,6 +96,10 @@ static inline void tick_set_periodic_handler(struct clock_event_device *dev, int
 #else /* !GENERIC_CLOCKEVENTS: */
 static inline void tick_suspend(void) { }
 static inline void tick_resume(void) { }
+
+static inline void clockevents_adjust_all_freqs(u32 mult_cs_mono,
+                                                u32 mult_cs_raw)
+{}
 #endif /* !GENERIC_CLOCKEVENTS */
 
 /* Oneshot related functions */
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index e0471e0..3d0ebc3 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1913,6 +1913,8 @@ static __always_inline void timekeeping_freqadjust(struct timekeeper *tk,
 
         /* scale the corrections */
         timekeeping_apply_adjustment(tk, offset, negative, adj_scale);
+        clockevents_adjust_all_freqs(tk->tkr_mono.mult,
+                                     tk->tkr_mono.clock->mult);
 }

/*
--
2.10.2