[PATCH 05/12] time: Add debugging checks to warn if we see delays

From: John Stultz
Date: Fri Mar 06 2015 - 21:50:37 EST


Recently there's been some request for better sanity
checking in the time code, so that its more clear
when something is going wrong since timekeeping issues
could manifest in a large number of strange ways with
various subsystems.

Thus, this patch adds some extra infrastructure to
add a check update_wall_time to print warnings if we
see the call delayed beyond the max_cycles overflow
point, or beyond the clocksource max_idle_ns value
which is currently 50% of the overflow point.

This extra infrastructure is conditionalized
behind a new CONFIG_DEBUG_TIMEKEEPING option
also added in this patch.

Tested this a bit by halting qemu for specified
lengths of time to trigger the warnings.

Cc: Dave Jones <davej@xxxxxxxxxxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Richard Cochran <richardcochran@xxxxxxxxx>
Cc: Prarit Bhargava <prarit@xxxxxxxxxx>
Cc: Stephen Boyd <sboyd@xxxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Signed-off-by: John Stultz <john.stultz@xxxxxxxxxx>
---
kernel/time/jiffies.c | 1 +
kernel/time/timekeeping.c | 26 ++++++++++++++++++++++++++
lib/Kconfig.debug | 12 ++++++++++++
3 files changed, 39 insertions(+)

diff --git a/kernel/time/jiffies.c b/kernel/time/jiffies.c
index a6a5bf5..7e41390 100644
--- a/kernel/time/jiffies.c
+++ b/kernel/time/jiffies.c
@@ -71,6 +71,7 @@ static struct clocksource clocksource_jiffies = {
.mask = 0xffffffff, /*32bits*/
.mult = NSEC_PER_JIFFY << JIFFIES_SHIFT, /* details above */
.shift = JIFFIES_SHIFT,
+ .max_cycles = 10,
};

__cacheline_aligned_in_smp DEFINE_SEQLOCK(jiffies_lock);
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 91db941..7e9d433 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -118,6 +118,29 @@ static inline void tk_update_sleep_time(struct timekeeper *tk, ktime_t delta)
tk->offs_boot = ktime_add(tk->offs_boot, delta);
}

+#ifdef CONFIG_DEBUG_TIMEKEEPING
+static void timekeeping_check_update(struct timekeeper *tk, cycle_t offset)
+{
+
+ cycle_t max_cycles = tk->tkr.clock->max_cycles;
+ const char *name = tk->tkr.clock->name;
+
+ if (offset > max_cycles)
+ printk_deferred("ERROR: cycle offset (%lld) is larger then"
+ " allowed %s max_cycles (%lld)\n",
+ offset, name, max_cycles);
+ else if (offset > (max_cycles >> 1))
+ printk_deferred("WARNING: cycle offset (%lld) is past"
+ " the %s 50%% safety margin (%lld)\n",
+ offset, name, max_cycles>>1);
+}
+#else
+static inline
+void timekeeping_check_update(struct timekeeper *tk, cycle_t offset)
+{
+}
+#endif
+
/**
* tk_setup_internals - Set up internals to use clocksource clock.
*
@@ -1630,6 +1653,9 @@ void update_wall_time(void)
if (offset < real_tk->cycle_interval)
goto out;

+ /* Do some additional sanity checking */
+ timekeeping_check_update(real_tk, offset);
+
/*
* With NO_HZ we may have to accumulate many cycle_intervals
* (think "ticks") worth of time at once. To do this efficiently,
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index c5cefb3..32065f6 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -865,6 +865,18 @@ config SCHED_STACK_END_CHECK
data corruption or a sporadic crash at a later stage once the region
is examined. The runtime overhead introduced is minimal.

+config DEBUG_TIMEKEEPING
+ bool "Enable extra timekeeping sanity checking"
+ help
+ This option will enable additional timekeeping sanity checks
+ which may be helpful when diagnoising issues where timekeeping
+ problems are suspected.
+
+ This may include checks in the timekeeping hotpaths, so this
+ option may have a performance impact to some workloads.
+
+ If unsure, say N.
+
config TIMER_STATS
bool "Collect kernel timers statistics"
depends on DEBUG_KERNEL && PROC_FS
--
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/