[tip:timers/core] clocksource: Simplify the logic around clocksource wrapping safety margins

From: tip-bot for John Stultz
Date: Fri Mar 13 2015 - 05:02:37 EST


Commit-ID: 362fde0410377e468ca00ad363fdf3e3ec42eb6a
Gitweb: http://git.kernel.org/tip/362fde0410377e468ca00ad363fdf3e3ec42eb6a
Author: John Stultz <john.stultz@xxxxxxxxxx>
AuthorDate: Wed, 11 Mar 2015 21:16:30 -0700
Committer: Ingo Molnar <mingo@xxxxxxxxxx>
CommitDate: Thu, 12 Mar 2015 10:16:38 +0100

clocksource: Simplify the logic around clocksource wrapping safety margins

The clocksource logic has a number of places where we try to
include a safety margin. Most of these are 12% safety margins,
but they are inconsistently applied and sometimes are applied
on top of each other.

Additionally, in the previous patch, we corrected an issue
where we unintentionally in effect created a 50% safety margin,
which these 12.5% margins where then added to.

So to simplify the logic here, this patch removes the various
12.5% margins, and consolidates adding the margin in one place:
clocks_calc_max_nsecs().

Additionally, Linus prefers a 50% safety margin, as it allows
bad clock values to be more easily caught. This should really
have no net effect, due to the corrected issue earlier which
caused greater then 50% margins to be used w/o issue.

Signed-off-by: John Stultz <john.stultz@xxxxxxxxxx>
Acked-by: Stephen Boyd <sboyd@xxxxxxxxxxxxxx> (for the sched_clock.c bit)
Cc: Dave Jones <davej@xxxxxxxxxxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Prarit Bhargava <prarit@xxxxxxxxxx>
Cc: Richard Cochran <richardcochran@xxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Link: http://lkml.kernel.org/r/1426133800-29329-3-git-send-email-john.stultz@xxxxxxxxxx
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
---
kernel/time/clocksource.c | 26 ++++++++++++--------------
kernel/time/sched_clock.c | 4 ++--
2 files changed, 14 insertions(+), 16 deletions(-)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 2148f41..ace9576 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -469,6 +469,9 @@ static u32 clocksource_max_adjustment(struct clocksource *cs)
* @shift: cycle to nanosecond divisor (power of two)
* @maxadj: maximum adjustment value to mult (~11%)
* @mask: bitmask for two's complement subtraction of non 64 bit counters
+ *
+ * NOTE: This function includes a safety margin of 50%, so that bad clock values
+ * can be detected.
*/
u64 clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask)
{
@@ -490,11 +493,14 @@ u64 clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask)
max_cycles = min(max_cycles, mask);
max_nsecs = clocksource_cyc2ns(max_cycles, mult - maxadj, shift);

+ /* Return 50% of the actual maximum, so we can detect bad values */
+ max_nsecs >>= 1;
+
return max_nsecs;
}

/**
- * clocksource_max_deferment - Returns max time the clocksource can be deferred
+ * clocksource_max_deferment - Returns max time the clocksource should be deferred
* @cs: Pointer to clocksource
*
*/
@@ -504,13 +510,7 @@ static u64 clocksource_max_deferment(struct clocksource *cs)

max_nsecs = clocks_calc_max_nsecs(cs->mult, cs->shift, cs->maxadj,
cs->mask);
- /*
- * To ensure that the clocksource does not wrap whilst we are idle,
- * limit the time the clocksource can be deferred by 12.5%. Please
- * note a margin of 12.5% is used because this can be computed with
- * a shift, versus say 10% which would require division.
- */
- return max_nsecs - (max_nsecs >> 3);
+ return max_nsecs;
}

#ifndef CONFIG_ARCH_USES_GETTIMEOFFSET
@@ -659,10 +659,9 @@ void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq)
* conversion precision. 10 minutes is still a reasonable
* amount. That results in a shift value of 24 for a
* clocksource with mask >= 40bit and f >= 4GHz. That maps to
- * ~ 0.06ppm granularity for NTP. We apply the same 12.5%
- * margin as we do in clocksource_max_deferment()
+ * ~ 0.06ppm granularity for NTP.
*/
- sec = (cs->mask - (cs->mask >> 3));
+ sec = cs->mask;
do_div(sec, freq);
do_div(sec, scale);
if (!sec)
@@ -674,9 +673,8 @@ void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq)
NSEC_PER_SEC / scale, sec * scale);

/*
- * for clocksources that have large mults, to avoid overflow.
- * Since mult may be adjusted by ntp, add an safety extra margin
- *
+ * Ensure clocksources that have large 'mult' values don't overflow
+ * when adjusted.
*/
cs->maxadj = clocksource_max_adjustment(cs);
while ((cs->mult + cs->maxadj < cs->mult)
diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c
index 01d2d15..3b8ae45 100644
--- a/kernel/time/sched_clock.c
+++ b/kernel/time/sched_clock.c
@@ -125,9 +125,9 @@ void __init sched_clock_register(u64 (*read)(void), int bits,

new_mask = CLOCKSOURCE_MASK(bits);

- /* calculate how many ns until we wrap */
+ /* calculate how many nanosecs until we risk wrapping */
wrap = clocks_calc_max_nsecs(new_mult, new_shift, 0, new_mask);
- new_wrap_kt = ns_to_ktime(wrap - (wrap >> 3));
+ new_wrap_kt = ns_to_ktime(wrap);

/* update epoch for new counter and update epoch_ns from old counter*/
new_epoch = read();
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/