Re: locking/csd-lock: Switch from sched_clock() to ktime_get_mono_fast_ns()

From: Paul E. McKenney
Date: Thu Oct 10 2024 - 10:21:29 EST


On Thu, Oct 10, 2024 at 01:21:32PM +0200, Peter Zijlstra wrote:
> On Wed, Oct 09, 2024 at 11:18:34AM -0700, Paul E. McKenney wrote:
> > On Wed, Oct 09, 2024 at 08:07:08PM +0200, Peter Zijlstra wrote:
> > > On Wed, Oct 09, 2024 at 10:57:24AM -0700, Paul E. McKenney wrote:
> > > > Currently, the CONFIG_CSD_LOCK_WAIT_DEBUG code uses sched_clock()
> > > > to check for excessive CSD-lock wait times. This works, but does not
> > > > guarantee monotonic timestamps.
> > >
> > > It does if you provide a sane TSC
> >
> > What is this "sane TSC" of which you speak? ;-)
> >
> > More seriously, the raw reads from the TSC that are carried out by
> > sched_clock() are not guaranteed to be monotonic due to potential
> > instruction reordering and the like. This is *not* a theoretical
> > statement -- we really do see this on the fleet. Very rarely for any
> > given system, to be sure, but not at all rare across the full set of them.
> >
> > This results in false-positive CSD-lock complaints claiming almost 2^64
> > nanoseconds of delay, which are not good complaints to have.
>
> Ooh, so the real difference is that clocksource_tsc ends up using
> rdtsc_ordered() while sched_clock() ends up using rdtsc(), and you're
> actually seeing that reordering happen.

You got it!
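
To make the "almost 2^64 nanoseconds" symptom concrete, here is a minimal
userspace sketch of what happens when the second timestamp reads slightly
earlier than the first. This is not the kernel's CSD-lock code: the function
names and the 5-second threshold are hypothetical, chosen only for
illustration. The guarded variant is likewise just a demonstration of the
failure mode. The patch under discussion addresses the problem by switching
clocks, not by adding a guard.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical threshold for illustration only: 5 seconds in nanoseconds. */
#define WAIT_TIMEOUT_NS (5ULL * 1000 * 1000 * 1000)

/*
 * Naive check: if ts_now is smaller than ts_start (time appeared to go
 * backwards), the unsigned subtraction wraps around to nearly 2^64 and
 * the wait is reported as excessive.
 */
static int wait_too_long_naive(uint64_t ts_start, uint64_t ts_now)
{
	uint64_t delta = ts_now - ts_start;

	return delta > WAIT_TIMEOUT_NS;
}

/*
 * Guarded check: treat a backwards step as zero elapsed time.  Shown
 * only to contrast with the naive version above.
 */
static int wait_too_long_guarded(uint64_t ts_start, uint64_t ts_now)
{
	if (ts_now < ts_start)
		return 0;
	return (ts_now - ts_start) > WAIT_TIMEOUT_NS;
}
```

With ts_start = 1000 and ts_now = 990, the naive delta is 2^64 - 10, so the
naive check fires even though essentially no time has passed.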

> *urgh*.
>
> Yes, please put that in the Changelog.

I will do so on my next rebase. And thank you for looking this over!

Thanx, Paul