Re: [PATCH v2] clocksource: Defer marking clocksources unstable to kthread

From: Paul E. McKenney
Date: Sun Mar 09 2025 - 11:58:57 EST


On Sun, Mar 09, 2025 at 07:36:05PM +0800, Yafang Shao wrote:
> On Sun, Mar 9, 2025 at 12:38 AM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> >
> > On Thu, Mar 06 2025 at 08:06, Paul E. McKenney wrote:
> > > The clocksource watchdog marks clocksources unstable from within a timer
> > > handler. On x86, this marking involves an on_each_cpu_cond_mask(),
> > > which in turn invokes smp_call_function_many_cond(), which may not be
> > > invoked from a timer handler. Doing so results in:
> > >
> > > WARNING: CPU: 3 PID: 0 at kernel/smp.c:815 smp_call_function_many_cond+0x46b/0x4c0
> > >
> > > Fix this by deferring the marking to the clocksource watchdog kthread.
> > > Note that marking unstable is already deferred, so deferring it a bit
> > > more should be just fine.
> >
> > While this can be done, that's papering over the underlying problem,
> > which was introduced with:
> >
> > 8722903cbb8f ("sched: Define sched_clock_irqtime as static key")
> >
> > That added the static key switch, which is causing the problem. And
> > "fixing" this in the clocksource watchdog is incomplete because the same
> > problem exists during CPU hotplug when the TSC synchronization declares
> > > the TSC unstable. It's exactly the same problem as was fixed via:
> >
> > 6577e42a3e16 ("sched/clock: Fix up clear_sched_clock_stable()")
> >
> > So as this got introduced in the 6.14 merge window, the proper fix is to
> > revert commit 8722903cbb8f and send it back to the drawing board. It was
> > clearly never tested with the various possibilities which invoke
> > mark_tsc*_unstable().
>
> Hello Thomas,
>
> It has been reverted by the following commit:
> b9f2b29b9494 ("sched: Don't define sched_clock_irqtime as static key")
>
> https://web.git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?id=b9f2b29b94943b08157e3dfc970baabc7944dbc3

Thank you! I will drop my commit on my next rebase.

Thanx, Paul