Re: locking/csd-lock: Switch from sched_clock() to ktime_get_mono_fast_ns()

From: Peter Zijlstra
Date: Thu Oct 10 2024 - 07:21:45 EST


On Wed, Oct 09, 2024 at 11:18:34AM -0700, Paul E. McKenney wrote:
> On Wed, Oct 09, 2024 at 08:07:08PM +0200, Peter Zijlstra wrote:
> > On Wed, Oct 09, 2024 at 10:57:24AM -0700, Paul E. McKenney wrote:
> > > Currently, the CONFIG_CSD_LOCK_WAIT_DEBUG code uses sched_clock()
> > > to check for excessive CSD-lock wait times. This works, but does not
> > > guarantee monotonic timestamps.
> >
> > It does if you provide a sane TSC
>
> What is this "sane TSC" of which you speak? ;-)
>
> More seriously, the raw reads from the TSC that are carried out by
> sched_clock() are not guaranteed to be monotonic due to potential
> instruction reordering and the like. This is *not* a theoretical
> statement -- we really do see this on the fleet. Very rarely for any
> given system, to be sure, but not at all rare across the full set of them.
>
> This results in false-positive CSD-lock complaints claiming almost 2^64
> nanoseconds of delay, which are not good complaints to have.

Ooh, so the real difference is that clocksource_tsc ends up using
rdtsc_ordered() while sched_clock() ends up using rdtsc(), and you're
actually seeing that reordering happen.

*urgh*.

Yes, please put that in the Changelog.