Re: [RFC/RFT][PATCH 6/7] sched: idle: Predict idle duration before stopping the tick

From: Rik van Riel
Date: Mon Mar 05 2018 - 08:19:30 EST


On Mon, 2018-03-05 at 13:35 +0100, Peter Zijlstra wrote:
> On Sun, Mar 04, 2018 at 11:28:56PM +0100, Rafael J. Wysocki wrote:
> > Index: linux-pm/kernel/sched/idle.c
> > ===================================================================
> > --- linux-pm.orig/kernel/sched/idle.c
> > +++ linux-pm/kernel/sched/idle.c
> > @@ -188,13 +188,14 @@ static void cpuidle_idle_call(void)
> > } else {
> > unsigned int duration_us;
> >
> > - tick_nohz_idle_go_idle(true);
> > - rcu_idle_enter();
> > -
> > /*
> > * Ask the cpuidle framework to choose a
> > convenient idle state.
> > */
> > next_state = cpuidle_select(drv, dev,
> > &duration_us);
> > +
> > + tick_nohz_idle_go_idle(duration_us > USEC_PER_SEC
> > / HZ);
>
> (FWIW we have TICK_USEC for this)
>
> > + rcu_idle_enter();
> > +
> > entered_state = call_cpuidle(drv, dev,
> > next_state);
> > /*
> > * Give the governor an opportunity to reflect on
> > the outcome
>
> Also, I think that at this point you've introduced a problem; by not
> disabling the tick unconditionally, we'll have extra wakeups due to
> the
> (now still running) tick, which will bias the estimation, as per
> reflect(), downwards.
>
> We should effectively discard tick wakeups when we could have entered
> nohz but didn't, accumulating the idle period in reflect and only
> commit
> once we get a !tick wakeup.

How much of a problem would that actually be?

Don't all but the very deepest C-states have
target residencies that are orders of magnitude
smaller than the tick period?

In other words, if our sleeps end up getting
"cut short" to 600us, we will still select C6,
and it will not result in picking C3 by mistake.

This only seems to affect C7 states and deeper.

It may be worth fixing in the long run, but that
would require keeping track of whether anything
non-idle was done in-between two invocations of
do_idle(), and then checking that there.

That would include not just seeing whether there
have been any context switches on the CPU (easy?),
but also whether any non-timer interrupts were run.

--
All Rights Reversed.

Attachment: signature.asc
Description: This is a digitally signed message part