Re: [patch 2/2] sched/idle: Make default_idle_call() NOHZ aware
From: Christian Loehle
Date: Mon Mar 02 2026 - 06:03:59 EST
On 3/2/26 10:43, Frederic Weisbecker wrote:
> On Sun, Mar 01, 2026 at 08:30:51PM +0100, Thomas Gleixner wrote:
>> Guests fall back to default_idle_call() as there is no cpuidle driver
>> available to them by default. That causes a problem in fully loaded
>> scenarios where CPUs go briefly idle for a couple of microseconds:
>>
>> tick_nohz_idle_stop_tick() is invoked unconditionally, which means that
>> unless there is a timer pending in the next tick, the tick is stopped and
>> then restarted a couple of microseconds later when the idle condition goes
>> away. That requires programming the clockevent device twice, which implies
>> a VM exit for each reprogramming.
>>
>> It was suggested to remove the tick_nohz_idle_stop_tick() invocation from
>> the default idle code, but that would be counterproductive. It would not allow
>> the host to go into deeper idle states when the guest CPU is fully idle as
>> it has to maintain the periodic tick.
>>
>> Cure this by implementing a trivial moving average filter which keeps track
>> of the recent idle residency time and only stops the tick when the average
>> is larger than a tick.
>>
>> Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxx>
>
> Shouldn't there be instead a new dedicated cpuidle driver with proper governor support?

I think a dummy cpuidle driver is an option, but calling into any governor
seems like overkill IMO. It presents an option to the user where there really
is none: with no idle states to choose from, the cpuidle governor would just
make a boolean decision.
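
For reference, the kind of filter described in the quoted changelog could look
roughly like the sketch below. This is not the code from the patch: the names
(idle_avg, idle_avg_update, idle_should_stop_tick), the 1/8 weighting of new
samples and the HZ=1000 tick length are all assumptions made for illustration,
and it is plain userspace C rather than kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* All names and constants here are hypothetical, not taken from the patch. */
#define TICK_NSEC  1000000ULL  /* one tick in ns, assuming HZ=1000 */
#define AVG_SHIFT  3           /* each new sample is weighted 1/8 */

struct idle_avg {
	uint64_t avg_ns;       /* running average of idle residency */
};

/* Fold the residency of the idle period that just ended into the average. */
static void idle_avg_update(struct idle_avg *ia, uint64_t residency_ns)
{
	/* avg += (sample - avg) / 8, done in signed arithmetic */
	int64_t delta = (int64_t)residency_ns - (int64_t)ia->avg_ns;

	ia->avg_ns = (uint64_t)((int64_t)ia->avg_ns + delta / (1 << AVG_SHIFT));
}

/* The boolean decision: stop the tick only if recent idles outlast a tick. */
static bool idle_should_stop_tick(const struct idle_avg *ia)
{
	return ia->avg_ns > TICK_NSEC;
}
```

With something like this, a CPU that keeps going idle for only a few
microseconds keeps its average well below TICK_NSEC and leaves the tick
alone, while a CPU that stays idle for long periods pushes the average above
a tick and stops it as before.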