Re: [PATCH v1] sched: idle: Consolidate the handling of two special cases

From: Rafael J. Wysocki

Date: Sat Mar 14 2026 - 07:31:02 EST


On Fri, Mar 13, 2026 at 4:45 PM Christian Loehle
<christian.loehle@xxxxxxx> wrote:
>
> On 3/13/26 15:28, Rafael J. Wysocki wrote:
> > On Fri, Mar 13, 2026 at 3:04 PM Christian Loehle
> > <christian.loehle@xxxxxxx> wrote:
> >>
> >> On 3/13/26 13:07, Rafael J. Wysocki wrote:
> >>> On Fri, Mar 13, 2026 at 1:53 PM Christian Loehle
> >>> <christian.loehle@xxxxxxx> wrote:
> >>>>
> >>>> On 3/13/26 12:25, Rafael J. Wysocki wrote:
> >>>>> From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> >>>>>
> >>>>> There are two special cases in the idle loop that are handled
> >>>>> inconsistently even though they are analogous.
> >>>>>
> >>>>> The first one is when a cpuidle driver is absent and the default CPU
> >>>>> idle time power management implemented by the architecture code is used.
> >>>>> In that case, the scheduler tick is stopped every time before invoking
> >>>>> default_idle_call().
> >>>>>
> >>>>> The second one is when a cpuidle driver is present, but there is only
> >>>>> one idle state in its table. In that case, the scheduler tick is never
> >>>>> stopped.
> >>>>>
> >>>>> Since each of these approaches leads to suboptimal choices in some
> >>>>> cases, reconcile them with the help of one simple heuristic. Namely,
> >>>>> stop the tick if the CPU has been woken up by it in the previous
> >>>>> iteration of the idle loop, or let it tick otherwise.>
> >>>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> >>>>> ---
> >>>>>
> >>>>> Based on today's mainline.
> >>>>>
> >>>>> ---
> >>>>> kernel/sched/idle.c | 30 +++++++++++++++++++++---------
> >>>>> 1 file changed, 21 insertions(+), 9 deletions(-)
> >>>>>
> >>>>> --- a/kernel/sched/idle.c
> >>>>> +++ b/kernel/sched/idle.c
> >>>>> @@ -161,6 +161,14 @@ static int call_cpuidle(struct cpuidle_d
> >>>>> return cpuidle_enter(drv, dev, next_state);
> >>>>> }
> >>>>>
> >>>>> +static void idle_call_stop_or_retain_tick(bool stop_tick)
> >>>>> +{
> >>>>> + if (stop_tick || tick_nohz_tick_stopped())
> >>>>> + tick_nohz_idle_stop_tick();
> >>>>> + else
> >>>>> + tick_nohz_idle_retain_tick();
> >>>>> +}
> >>>>> +
> >>>>> /**
> >>>>> * cpuidle_idle_call - the main idle function
> >>>>> *
> >>>>> @@ -170,7 +178,7 @@ static int call_cpuidle(struct cpuidle_d
> >>>>> * set, and it returns with polling set. If it ever stops polling, it
> >>>>> * must clear the polling bit.
> >>>>> */
> >>>>> -static void cpuidle_idle_call(void)
> >>>>> +static void cpuidle_idle_call(bool stop_tick)
> >>>>> {
> >>>>> struct cpuidle_device *dev = cpuidle_get_device();
> >>>>> struct cpuidle_driver *drv = cpuidle_get_cpu_driver(dev);
> >>>>> @@ -186,7 +194,7 @@ static void cpuidle_idle_call(void)
> >>>>> }
> >>>>>
> >>>>> if (cpuidle_not_available(drv, dev)) {
> >>>>> - tick_nohz_idle_stop_tick();
> >>>>> + idle_call_stop_or_retain_tick(stop_tick);
> >>>>>
> >>>>> default_idle_call();
> >>>>> goto exit_idle;
> >>>>> @@ -222,17 +230,19 @@ static void cpuidle_idle_call(void)
> >>>>> next_state = cpuidle_find_deepest_state(drv, dev, max_latency_ns);
> >>>>> call_cpuidle(drv, dev, next_state);
> >>>>> } else if (drv->state_count > 1) {
> >>>>> - bool stop_tick = true;
> >>>>> + /*
> >>>>> + * stop_tick is expected to be true by default by cpuidle
> >>>>> + * governors, which allows them to select idle states with
> >>>>> + * target residency above the tick period length.
> >>>>> + */
> >>>>> + stop_tick = true;
> >>>>>
> >>>>> /*
> >>>>> * Ask the cpuidle framework to choose a convenient idle state.
> >>>>> */
> >>>>> next_state = cpuidle_select(drv, dev, &stop_tick);
> >>>>>
> >>>>> - if (stop_tick || tick_nohz_tick_stopped())
> >>>>> - tick_nohz_idle_stop_tick();
> >>>>> - else
> >>>>> - tick_nohz_idle_retain_tick();
> >>>>> + idle_call_stop_or_retain_tick(stop_tick);
> >>>>>
> >>>>> entered_state = call_cpuidle(drv, dev, next_state);
> >>>>> /*
> >>>>> @@ -240,7 +250,7 @@ static void cpuidle_idle_call(void)
> >>>>> */
> >>>>> cpuidle_reflect(dev, entered_state);
> >>>>> } else {
> >>>>> - tick_nohz_idle_retain_tick();
> >>>>> + idle_call_stop_or_retain_tick(stop_tick);
> >>>>
> >>>> This would supersede e5c9ffc6ae1b ("cpuidle: Skip governor when only one idle state is available")
> >>>> so we should remove that code too.
> >>>
> >>> Which code? Do you mean the one that has been removed by
> >>>
> >>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d557640e4ce589a24dca5ca7ce3b9680f471325f
> >>>
> >> Ah of course, sorry!
> >
> > No worries.
> >
> >> Reviewed-by: Christian Loehle <christian.loehle@xxxxxxx>
> >
> > So should I regard this as a fix for 7.0?
> >
> > I guess so because I don't think it would be useful to ship 7.0
> > without it only to change the behavior immediately in 7.1. And I
> > think that it can be treated as a fix for e5c9ffc6ae1b (above).
>
> Yes I think it being a fix for e5c9ffc6ae1b should be fine (fingers crossed).

OK

So I'm going to queue it up for the next -rc. If anyone has
objections, please let me know.