Re: [RFC][PATCH 2/4] time: Fix CLOCK_MONOTONIC_RAW sub-nanosecond accounting

From: Ingo Molnar
Date: Sat May 27 2017 - 03:36:17 EST



* John Stultz <john.stultz@xxxxxxxxxx> wrote:

> Due to how the MONOTONIC_RAW accumulation logic was handled,
> there is the potential for a 1ns discontinuity when we do
> accumulations. This small discontinuity has for the most part
> gone un-noticed, but since ARM64 enabled CLOCK_MONOTONIC_RAW
> in their vDSO clock_gettime implementation, we've seen failures
> with the inconsistency-check test in kselftest.
>
> This patch addresses the issue by using the same sub-ns
> accumulation handling that CLOCK_MONOTONIC uses, which avoids
> the issue for in-kernel users.
>
> Since the ARM64 vDSO implementation has its own clock_gettime
> calculation logic, this patch reduces the frequency of errors,
> but failures are still seen. The ARM64 vDSO will need to be
> updated to include the sub-nanosecond xtime_nsec values in its
> calculation for this issue to be completely fixed.
>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Cc: Miroslav Lichvar <mlichvar@xxxxxxxxxx>
> Cc: Richard Cochran <richardcochran@xxxxxxxxx>
> Cc: Prarit Bhargava <prarit@xxxxxxxxxx>
> Cc: Stephen Boyd <stephen.boyd@xxxxxxxxxx>
> Cc: Kevin Brodsky <kevin.brodsky@xxxxxxx>
> Cc: Will Deacon <will.deacon@xxxxxxx>
> Cc: Daniel Mentz <danielmentz@xxxxxxxxxx>
> Signed-off-by: John Stultz <john.stultz@xxxxxxxxxx>
> ---
> include/linux/timekeeper_internal.h | 4 ++--
> kernel/time/timekeeping.c | 21 ++++++++++++---------
> 2 files changed, 14 insertions(+), 11 deletions(-)
>
> diff --git a/include/linux/timekeeper_internal.h b/include/linux/timekeeper_internal.h
> index 110f453..528cc86 100644
> --- a/include/linux/timekeeper_internal.h
> +++ b/include/linux/timekeeper_internal.h
> @@ -58,7 +58,7 @@ struct tk_read_base {
> * interval.
> * @xtime_remainder: Shifted nano seconds left over when rounding
> * @cycle_interval
> - * @raw_interval: Raw nano seconds accumulated per NTP interval.
> + * @raw_interval: Shifted raw nano seconds accumulated per NTP interval.
> * @ntp_error: Difference between accumulated time and NTP time in ntp
> * shifted nano seconds.
> * @ntp_error_shift: Shift conversion between clock shifted nano seconds and
> @@ -100,7 +100,7 @@ struct timekeeper {
> u64 cycle_interval;
> u64 xtime_interval;
> s64 xtime_remainder;
> - u32 raw_interval;
> + u64 raw_interval;
> /* The ntp_tick_length() value currently being used.
> * This cached copy ensures we consistently apply the tick
> * length for an entire tick, as ntp_tick_length may change
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index abc1968..35d3ba3 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -282,7 +282,7 @@ static void tk_setup_internals(struct timekeeper *tk, struct clocksource *clock)
> /* Go back from cycles -> shifted ns */
> tk->xtime_interval = interval * clock->mult;
> tk->xtime_remainder = ntpinterval - tk->xtime_interval;
> - tk->raw_interval = (interval * clock->mult) >> clock->shift;
> + tk->raw_interval = interval * clock->mult;
>
> /* if changing clocks, convert xtime_nsec shift units */
> if (old_clock) {
> @@ -1994,7 +1994,7 @@ static u64 logarithmic_accumulation(struct timekeeper *tk, u64 offset,
> u32 shift, unsigned int *clock_set)
> {
> u64 interval = tk->cycle_interval << shift;
> - u64 raw_nsecs;
> + u64 nsecps;

What does the 'ps' postfix stand for? It's not obvious (to me).

> + tk->tkr_raw.xtime_nsec += (u64)tk->raw_time.tv_nsec
> + << tk->tkr_raw.shift;
> + tk->tkr_raw.xtime_nsec -= (u64)tk->raw_time.tv_nsec
> + << tk->tkr_raw.shift;

Please don't break the line in such an ugly, random way, just write:

tk->tkr_raw.xtime_nsec += (u64)tk->raw_time.tv_nsec << tk->tkr_raw.shift;
tk->tkr_raw.xtime_nsec -= (u64)tk->raw_time.tv_nsec << tk->tkr_raw.shift;

Thanks,

Ingo