Re: [PATCH RFC] tick/sched: Prevent pointless NOHZ transitions

From: Rafael J. Wysocki

Date: Tue Feb 24 2026 - 16:31:26 EST


On Tuesday, February 24, 2026 5:13:06 PM CET Thomas Gleixner wrote:
> On Tue, Feb 24 2026 at 09:35, Christian Loehle wrote:
> > On 2/24/26 08:32, Thomas Gleixner wrote:
> >> This happens with both TEO and MENU governors in a VM guest. That's not
> >> only pointless it's also a performance issue as each rearm of the timer
> >> implies a VM exit.
> >
> > This is the (drv->state_count <= 1) case I assume, no governor does anything
> > sensible in that case.
>
> Indeed.
>
> > I was also curious about the performance angle recently FWIW, but didn't
> > hear back:
> > https://lore.kernel.org/all/73439919-e24d-4bd5-a7ed-d7633beb5e4f@xxxxxxx/
>
> Sure, but I can tell you that two VM exits for a 10us idle are really
> harming performance a lot. That's why I noticed.
>
> >> Keep track of the idle time with a moving average and check it for being
> >> larger than TICK_NSEC in can_stop_idle_tick(). That cures this behaviour
> >> while still allowing the system to go into long idle sleeps once the
> >> work load stopped.
> >>
> >> Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxx>
> >> ---
> >> kernel/time/tick-sched.c | 20 +++++++++++++++++---
> >> kernel/time/tick-sched.h | 9 +++++++++
> >> 2 files changed, 26 insertions(+), 3 deletions(-)
> >
> > Why here and not in cpuidle?
>
> I don't care where it is fixed, that's why I marked it RFC
>
> > We've recently added some code for the single state case to skip
> > governor see
>
> Duh. I just noticed, the VM has no driver, so this will not end up in
> cpuidle_select(). No wonder that changing the governor has no effect :)
>
> I set the governor to haltpoll now, but that does not work either as the
> stupid haltpoll driver is built in and not activated as it requires the
> force parameter unless the KVM hypervisor has KVM_HINTS_REALTIME set.
>
> Brilliant, intuitive and truly user friendly stuff all that.
>
> It's amazing as always that all the "performance experts" who cry murder
> on everything else never noticed this completely nonsensical default
> behaviour.
>
> Force enabling that driver and setting the governor to 'teo' makes it go
> away. 'menu' still sucks pretty much the same way as with none; slightly
> less so, but often enough.
>
> > e5c9ffc6ae1b ("cpuidle: Skip governor when only one idle state is available")
> > where that could also live.
>
> So either ladder or the powernv driver is broken and that gets fixed in
> the cpuidle core. Interesting choice.
>
> But as I explained above adding something to this hack won't help for
> the VM case with no driver active because cpuidle_not_available() is
> true and idle ends up in default_idle_call().
>
> So either the governor/driver muck provides some sensible default
> implementation or this has to go into default_idle_call().
>
> Oh well...

It looks like the issue is caused by the tick_nohz_idle_stop_tick() call right
before default_idle_call() is invoked.

After the recent changes mentioned above, cpuidle_select() will never stop the
tick when there's only one idle state in the cpuidle driver, so it would be
consistent to make the default case behave analogously. The default idle state
is never a deep one AFAICS.

So maybe something like the below?

---
 kernel/sched/idle.c |    2 --
 1 file changed, 2 deletions(-)

--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -186,8 +186,6 @@ static void cpuidle_idle_call(void)
 	}
 
 	if (cpuidle_not_available(drv, dev)) {
-		tick_nohz_idle_stop_tick();
-
 		default_idle_call();
 		goto exit_idle;
 	}
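
To illustrate the intended effect, here is a rough userspace model of that
branch before and after the change. It is only a sketch, not kernel code: the
struct, the counters and all function names in it are hypothetical stand-ins,
and it assumes every pass through the branch attempts to stop the tick, which
glosses over the checks tick_nohz_idle_stop_tick() actually performs.

```c
#include <stdbool.h>

/* Hypothetical counters standing in for the real kernel side effects. */
struct idle_model {
	int tick_stop_attempts;	/* models calls to tick_nohz_idle_stop_tick() */
	int default_idles;	/* models calls to default_idle_call() */
};

/* The cpuidle_not_available() branch as it is today (assumed model). */
static void idle_branch_before(struct idle_model *m)
{
	m->tick_stop_attempts++;
	m->default_idles++;
}

/* The same branch with the tick-stop call removed. */
static void idle_branch_after(struct idle_model *m)
{
	m->default_idles++;
}

/*
 * Run nr_idles short idle periods through the chosen variant and return
 * how many tick-stop attempts they caused.
 */
static int count_tick_stop_attempts(int nr_idles, bool patched)
{
	struct idle_model m = { 0, 0 };
	int i;

	for (i = 0; i < nr_idles; i++) {
		if (patched)
			idle_branch_after(&m);
		else
			idle_branch_before(&m);
	}

	return m.tick_stop_attempts;
}
```

In this model the unpatched branch makes one tick-stop attempt per idle entry,
no matter how short the idle period is, while the patched branch makes none and
simply leaves the tick running across default_idle_call().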