Re: [patch 01/12] clockevents: Prevent timer interrupt starvation

From: Frederic Weisbecker

Date: Tue Apr 07 2026 - 10:07:58 EST


On Tue, Apr 07, 2026 at 10:54:17AM +0200, Thomas Gleixner wrote:
> From: Thomas Gleixner <tglx@xxxxxxxxxx>
>
> Calvin reported an odd NMI watchdog lockup which claims that the CPU locked
> up in user space. He provided a reproducer, which sets up a timerfd based
> timer and then rearms it in a loop with an absolute expiry time of 1ns.
>
> As the expiry time is in the past, the timer ends up as the first expiring
> timer in the per CPU hrtimer base and the clockevent device is programmed
> with the minimum delta value. If the machine is fast enough, this ends up
> in an endless loop of programming the delta value to the minimum value
> defined by the clock event device, before the timer interrupt can fire,
> which starves the interrupt and consequently triggers the lockup detector
> because the hrtimer callback of the lockup mechanism is never invoked.
>
> As a first step to prevent this, avoid reprogramming the clock event device
> when:
> - a forced minimum delta event is pending
> - the new expiry delta is less than or equal to the minimum delta
>
> Thanks to Calvin for providing the reproducer and to Borislav for testing
> and providing data from his Zen5 machine.
>
> The problem is not limited to Zen5, but depending on the underlying
> clock event device (e.g. TSC deadline timer on Intel) and the CPU speed
> it is not necessarily observable.
>
> This change serves only as the last resort and further changes will be made
> to prevent this scenario earlier in the call chain as far as possible.
>
> Fixes: d316c57ff6bf ("[PATCH] clockevents: add core functionality")
> Reported-by: Calvin Owens <calvin@xxxxxxxxxx>
> Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Anna-Maria Behnsen <anna-maria@xxxxxxxxxxxxx>
> Cc: Frederic Weisbecker <frederic@xxxxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Link: https://lore.kernel.org/lkml/acMe-QZUel-bBYUh@xxxxxxxxxxxxx/
> ---
> V2: Simplified the clockevents code - Peter
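
For reference, here is how I read the two conditions listed above, written
as a standalone userspace model and not as the actual kernel change. All
names (struct model_dev, forced_pending, model_program_event,
model_event_fired) are made up for illustration, and I am assuming that both
conditions must hold together for the reprogramming to be skipped:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model, not the kernel's struct clock_event_device. */
struct model_dev {
	int64_t min_delta_ns;    /* smallest delta the device accepts */
	bool forced_pending;     /* a forced min-delta event is already armed */
	unsigned int hw_writes;  /* how often the "hardware" was programmed */
};

/* Returns true if the device was (re)programmed, false if it was skipped. */
static bool model_program_event(struct model_dev *dev, int64_t expires_ns,
				int64_t now_ns)
{
	int64_t delta = expires_ns - now_ns;

	if (delta <= dev->min_delta_ns) {
		/* Both conditions hold: leave the pending event alone. */
		if (dev->forced_pending)
			return false;
		/* First short request: arm the minimum delta once. */
		dev->forced_pending = true;
	} else {
		/* A regular expiry replaces any forced event. */
		dev->forced_pending = false;
	}
	dev->hw_writes++;
	return true;
}

/* Called from the model's interrupt path once the event has fired. */
static void model_event_fired(struct model_dev *dev)
{
	dev->forced_pending = false;
}
```

In this model a rearm loop with an expiry in the past writes the device once
and then gets skipped until the interrupt has been handled.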

Isn't it possible to rely on dev->next_event instead? In the above scenario,
a subsequent 0 delta request would not need to reprogram the device if
dev->next_event is already below the value returned by the new call to
ktime_get()?
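
Roughly what I have in mind, as an untested userspace sketch only. The
helper name model_can_skip_reprogram is hypothetical; next_event_ns stands in
for dev->next_event and now_ns for the value returned by ktime_get():

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not an existing kernel function.
 *
 * The device can be left alone when the new request is already due
 * (its delta would be clamped to 0) and the event recorded for the
 * device is itself already in the past, i.e. still pending.
 */
static bool model_can_skip_reprogram(int64_t next_event_ns,
				     int64_t new_expires_ns,
				     int64_t now_ns)
{
	bool new_request_is_due = new_expires_ns <= now_ns;
	bool armed_event_is_past = next_event_ns < now_ns;

	return new_request_is_due && armed_event_is_past;
}
```

Whether dev->next_event carries enough information for this in all the
relevant paths is exactly the open question.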

Thanks.

--
Frederic Weisbecker
SUSE Labs