Re: [tip:sched/hrtick] [hrtimer] 2889243848: stress-ng.timermix.ops_per_sec 30.1% regression

From: Peter Zijlstra

Date: Wed Mar 11 2026 - 08:15:11 EST


On Wed, Mar 11, 2026 at 11:58:19AM +0100, Peter Zijlstra wrote:

> > Hmm. The original code preserved hang_detected until the next timer
> > interrupt to prevent rearming when a new timer is queued.
>
> Oh indeed. And that avoids __hrtimer_reprogram() from coming in and
> 'destroying' the delay I suppose.
>
> Let me poke at this a little more then.

How's this then?

---
Subject: hrtimer: Less aggressive interrupt 'hang' handling
From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Date: Tue, 10 Mar 2026 20:02:21 +0100

When the hrtimer_interrupt needs to restart more than 3 times and
still has expired timers, the interrupt is considered hung. To give
the system a little time to recover, the hardware timer is programmed
a little into the future.

Prior to commit 288924384856 ("hrtimer: Re-arrange
hrtimer_interrupt()"), this was relative to the amount of time spent
serving the interrupt, with a max of 100 msec.

However, in order to simplify, and because this condition 'should' not
happen, the timeout was unconditionally set to 100 msec.

'Obviously' there is a benchmark that hits this hard, by programming a
ton of very short timers :-/

Since reprogramming is decoupled from the interrupt handling, the
actual execution time is lost; however, the code does track
max_hang_time. Using that, capped at 100 msec, rather than the
unconditional 100 msec restores performance.

stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --timermix 64

                    bogo ops/s
  288924384856^1:  23715979.93
  288924384856:    11550049.77
  patched:         23361116.78

Additionally, Thomas noted that we should not clear ->hang_detected
until the next interrupt, such that __hrtimer_reprogram() won't undo
the extra delay.

Fixes: 288924384856 ("hrtimer: Re-arrange hrtimer_interrupt()")
Closes: https://lore.kernel.org/oe-lkp/202603102229.74b9dee4-lkp@xxxxxxxxx
Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
---
kernel/time/hrtimer.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
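
For anyone who wants to poke at the logic outside the kernel, here is a
rough userspace model of what the changelog describes. It is only a
sketch: the toy_* names and struct toy_cpu_base are made up for
illustration and are not kernel symbols, times are plain nanosecond
integers instead of ktime_t, and the early return in toy_reprogram() is
an assumption about how the hang_detected flag keeps
__hrtimer_reprogram() from undoing the delay.

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_MSEC 1000000LL

/* Hypothetical stand-in for the few per-CPU fields discussed here. */
struct toy_cpu_base {
	bool    hang_detected;  /* set when the interrupt is considered hung */
	int64_t expires_next;   /* time the hardware timer is programmed for */
	int64_t max_hang_time;  /* longest hang observed so far, in ns */
};

/* Back-off used on a hang: max_hang_time, but never more than 100 msec. */
static int64_t toy_hang_delay(const struct toy_cpu_base *b)
{
	const int64_t cap = 100 * NSEC_PER_MSEC;

	return b->max_hang_time < cap ? b->max_hang_time : cap;
}

/* Hang path: push the next event a little into the future and mark it. */
static void toy_enter_hang(struct toy_cpu_base *b, int64_t now)
{
	b->hang_detected = true;
	b->expires_next = now + toy_hang_delay(b);
}

/*
 * Assumed model of reprogramming when a new timer is queued: while
 * hang_detected is set, an earlier expiry must not pull the event in.
 * Returns true if the programmed expiry changed.
 */
static bool toy_reprogram(struct toy_cpu_base *b, int64_t expires)
{
	if (b->hang_detected || expires >= b->expires_next)
		return false;

	b->expires_next = expires;
	return true;
}

/* Next interrupt: only now is the flag cleared again. */
static void toy_interrupt_entry(struct toy_cpu_base *b)
{
	b->hang_detected = false;
}
```

With max_hang_time at 50 usec the model backs off by 50 usec rather than
100 msec, and a timer queued in between does not shorten that until
toy_interrupt_entry() has run.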

--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -2031,8 +2031,8 @@ static void hrtimer_rearm(struct hrtimer
* Give the system a chance to do something else than looping
* on hrtimer interrupts.
*/
- expires_next = ktime_add_ns(ktime_get(), 100 * NSEC_PER_MSEC);
- cpu_base->hang_detected = false;
+ expires_next = ktime_add_ns(ktime_get(),
+ min(100 * NSEC_PER_MSEC, cpu_base->max_hang_time));
}
hrtimer_rearm_event(expires_next, deferred);
}
@@ -2121,6 +2121,7 @@ void hrtimer_interrupt(struct clock_even
*/
now = hrtimer_update_base(cpu_base);
expires_next = hrtimer_update_next_event(cpu_base);
+ cpu_base->hang_detected = false;
if (expires_next < now) {
if (++retries < 3)
goto retry;