Re: [PATCH v10 clocksource 3/7] clocksource: Check per-CPU clock synchronization when marked unstable

From: Andi Kleen
Date: Mon Apr 26 2021 - 00:12:39 EST

On Sun, Apr 25, 2021 at 03:47:04PM -0700, Paul E. McKenney wrote:
> Some sorts of per-CPU clock sources have a history of going out of
> synchronization with each other. However, this problem has purportedly
> been solved in the past ten years. Except that it is all too possible
> that the problem has instead simply been made less likely, which might
> mean that some of the occasional "Marking clocksource 'tsc' as unstable"
> messages might be due to desynchronization. How would anyone know?
> Therefore apply CPU-to-CPU synchronization checking to newly unstable
> clocksources that are marked with the new CLOCK_SOURCE_VERIFY_PERCPU flag.
> Lists of desynchronized CPUs are printed, with the caveat that if it
> is the reporting CPU that is itself desynchronized, it will appear that
> all the other clocks are wrong. Just like in real life.

Well, I could see this causing a gigantic flood of messages then.
Assume I have 300 cores: do I get all those messages repeated 300 times?
If the console is slow, this might end up taking a lot of CPU time.

And in a larger cluster this might not be uncommon.

There must be some way to throttle this.
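
One possible shape for such a throttle, as a rough sketch only. The kernel
already has rate-limited printing helpers such as printk_ratelimited(); the
code below is not that API and not anything from the patch, just a userspace
illustration of the windowed-burst idea. All names here (struct rl_state,
rl_allow(), report_all()) and the interval/burst numbers are made up for the
example.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical names for illustration; this is not the kernel's ratelimit API. */
struct rl_state {
	long interval;	/* window length, in caller-defined ticks */
	int burst;	/* messages allowed per window */
	long begin;	/* start of the current window */
	int printed;	/* messages emitted in the current window */
	int missed;	/* messages suppressed in the current window */
};

/* Return true if the caller may print a message at time "now". */
static bool rl_allow(struct rl_state *rs, long now)
{
	if (now - rs->begin >= rs->interval) {
		/* The window has expired: start a new one and reset counters. */
		rs->begin = now;
		rs->printed = 0;
		rs->missed = 0;
	}
	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}
	rs->missed++;
	return false;
}

/*
 * Simulate ncpus CPUs all reporting skew at the same tick.
 * Returns how many lines actually reached the console.
 */
static int report_all(struct rl_state *rs, int ncpus, long now)
{
	int cpu, printed = 0;

	for (cpu = 0; cpu < ncpus; cpu++) {
		if (rl_allow(rs, now)) {
			printf("CPU %d: clocksource skew detected\n", cpu);
			printed++;
		}
	}
	return printed;
}
```

With interval = 100 and burst = 5, a 300-CPU report at one tick prints five
lines and counts the other 295 as suppressed, so the console cost stays
bounded regardless of core count. Whether a per-window burst or a single
summary line is the better fit here is a separate question.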