[patch 0/2] sched/idle: Prevent pointless NOHZ transitions in default_idle_call()
From: Thomas Gleixner
Date: Sun Mar 01 2026 - 14:30:51 EST
default_idle_call() is used when cpuidle is not available. That's the case
on most virtual machines.
It unconditionally tries to transition to NOHZ idle mode on every
invocation, which allows the hypervisor to go into long idle sleeps.
But that's counterproductive on a loaded system where CPUs go briefly idle
for a couple of microseconds. That causes the clock event device to be
reprogrammed twice, once on entry and once when leaving idle a few
microseconds later. That's especially harmful for VMs as programming the
clock event device implies a VM exit.
See also the related discussion here:
https://lore.kernel.org/875x7mv8wd.ffs@tglx
Cure this by implementing a moving average tracking idle time in
default_idle_call() and only stop the tick when the resulting average idle
time is larger than a tick.
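
For illustration only, the idea can be sketched in plain userspace C as
below. This is not the code from the patches. All names (idle_avg,
idle_avg_update(), idle_avg_should_stop_tick(), SKETCH_TICK_NSEC), the 1/8
sample weight and the 4 ms tick length (HZ=250) are assumptions made up for
this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed tick length: 4 ms, i.e. HZ=250. Hypothetical constant. */
#define SKETCH_TICK_NSEC	4000000LL
/* Assumed weight of a new sample: 1/8. Hypothetical constant. */
#define SKETCH_AVG_DIV		8

/* Hypothetical per-CPU state: moving average of recent idle durations. */
struct idle_avg {
	int64_t avg_ns;
};

/*
 * Fold the duration of the idle period which just ended into the
 * average: avg += (sample - avg) / 8
 */
static void idle_avg_update(struct idle_avg *ia, int64_t idle_ns)
{
	assert(ia != NULL);
	ia->avg_ns += (idle_ns - ia->avg_ns) / SKETCH_AVG_DIV;
}

/* Stop the tick only when the average idle time exceeds one tick. */
static bool idle_avg_should_stop_tick(const struct idle_avg *ia)
{
	assert(ia != NULL);
	return ia->avg_ns > SKETCH_TICK_NSEC;
}
```

With these assumed constants, a CPU which repeatedly idles for a few
microseconds keeps the average far below one tick, so the tick stays
running and the clock event device is left alone. A single long idle
period raises the average above the threshold, and a run of short idle
periods afterwards decays it back below.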
The series applies on v7.0-rc1.
Thanks,
tglx
---
include/linux/cpuidle.h | 1
kernel/sched/idle.c | 65 ++++++++++++++++++++++++++++++++++++++++++------
2 files changed, 57 insertions(+), 9 deletions(-)