Re: [PATCH v8 clocksource 2/5] clocksource: Retry clock read if long delays detected

From: Paul E. McKenney
Date: Fri Apr 16 2021 - 20:25:23 EST


On Fri, Apr 16, 2021 at 10:45:28PM +0200, Thomas Gleixner wrote:
> On Tue, Apr 13 2021 at 21:35, Paul E. McKenney wrote:
> > #define WATCHDOG_INTERVAL (HZ >> 1)
> > #define WATCHDOG_THRESHOLD (NSEC_PER_SEC >> 4)
>
> Didn't we discuss that the threshold is too big ?

Indeed we did! How about like this, so that WATCHDOG_INTERVAL is at 500
microseconds and we tolerate up to 125 microseconds of delay during the
timer-read process?

I am firing up overnight tests.

Thanx, Paul

------------------------------------------------------------------------

commit 6c52b5f3cfefd6e429efc4413fd25e3c394e959f
Author: Paul E. McKenney <paulmck@xxxxxxxxxx>
Date: Fri Apr 16 16:19:43 2021 -0700

clocksource: Reduce WATCHDOG_THRESHOLD

Currently, WATCHDOG_THRESHOLD is set to detect a 62.5-millisecond skew in
a 500-millisecond WATCHDOG_INTERVAL. This requires that clocks be skewed
by more than 12.5% in order to be marked unstable. Except that a clock
that is skewed by that much is probably destroying unsuspecting software
right and left. And given that there are now checks for false-positive
skews due to delays between reading the two clocks, and given that current
hardware clocks all increment well in excess of 1MHz, it should be possible
to greatly decrease WATCHDOG_THRESHOLD.

Therefore, decrease WATCHDOG_THRESHOLD from the current 62.5 milliseconds
down to 500 microseconds.

Suggested-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: John Stultz <john.stultz@xxxxxxxxxx>
Cc: Stephen Boyd <sboyd@xxxxxxxxxx>
Cc: Jonathan Corbet <corbet@xxxxxxx>
Cc: Mark Rutland <Mark.Rutland@xxxxxxx>
Cc: Marc Zyngier <maz@xxxxxxxxxx>
Cc: Andi Kleen <ak@xxxxxxxxxxxxxxx>
[ paulmck: Apply Rik van Riel feedback. ]
Reported-by: Chris Mason <clm@xxxxxx>
Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 8f4967e59b05..4ec19a13dcf0 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -125,8 +125,8 @@ static void __clocksource_change_rating(struct clocksource *cs, int rating);
* Interval: 0.5sec Threshold: 0.0625s
*/
#define WATCHDOG_INTERVAL (HZ >> 1)
-#define WATCHDOG_THRESHOLD (NSEC_PER_SEC >> 4)
-#define WATCHDOG_MAX_SKEW (NSEC_PER_SEC >> 6)
+#define WATCHDOG_THRESHOLD (500 * NSEC_PER_USEC)
+#define WATCHDOG_MAX_SKEW (WATCHDOG_THRESHOLD / 4)

static void clocksource_watchdog_work(struct work_struct *work)
{