Re: [RFC/RFT][PATCH 6/7] sched: idle: Predict idle duration before stopping the tick

From: Peter Zijlstra
Date: Mon Mar 05 2018 - 07:36:14 EST


On Sun, Mar 04, 2018 at 11:28:56PM +0100, Rafael J. Wysocki wrote:
> Index: linux-pm/kernel/sched/idle.c
> ===================================================================
> --- linux-pm.orig/kernel/sched/idle.c
> +++ linux-pm/kernel/sched/idle.c
> @@ -188,13 +188,14 @@ static void cpuidle_idle_call(void)
> } else {
> unsigned int duration_us;
>
> - tick_nohz_idle_go_idle(true);
> - rcu_idle_enter();
> -
> /*
> * Ask the cpuidle framework to choose a convenient idle state.
> */
> next_state = cpuidle_select(drv, dev, &duration_us);
> +
> + tick_nohz_idle_go_idle(duration_us > USEC_PER_SEC / HZ);

(FWIW we have TICK_USEC for this)
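
A rough userspace sketch of that comparison, for illustration only. It
assumes a tick period rounded to the nearest microsecond; the helper
names tick_period_us() and should_stop_tick() are hypothetical, and
USEC_PER_SEC is defined locally rather than taken from kernel headers.

```c
#include <assert.h>
#include <stdbool.h>

#define USEC_PER_SEC 1000000U	/* local stand-in, not the kernel header */

/*
 * Hypothetical helper: length of one tick in microseconds for a given
 * tick rate, rounded to nearest instead of truncated.
 */
static unsigned int tick_period_us(unsigned int hz)
{
	assert(hz > 0);
	return (USEC_PER_SEC + hz / 2) / hz;
}

/*
 * Hypothetical helper: stop the tick only when the predicted idle
 * duration is longer than one tick period.
 */
static bool should_stop_tick(unsigned int duration_us, unsigned int hz)
{
	return duration_us > tick_period_us(hz);
}
```

For tick rates that divide 1000000 evenly the rounded and truncated
forms agree; they differ only for rates such as 1024.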

> + rcu_idle_enter();
> +
> entered_state = call_cpuidle(drv, dev, next_state);
> /*
> * Give the governor an opportunity to reflect on the outcome

Also, I think that at this point you've introduced a problem: by not
stopping the tick unconditionally, we'll get extra wakeups from the
(now still running) tick, and those will bias the estimate computed in
reflect() downwards.

We should effectively discard tick wakeups when we could have entered
nohz but didn't, accumulating the idle period in reflect() and only
committing it once we get a !tick wakeup.

Of course, for that to work we need to somehow divine what woke us,
which is going to be tricky.
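
The accumulate-then-commit idea can be sketched in plain C as follows.
This is a minimal illustration, not kernel code: struct idle_accum and
idle_accum_record() are hypothetical names, and the sketch simply
assumes the caller already knows whether the tick caused the wakeup,
which is exactly the hard part.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-CPU bookkeeping; not an existing kernel structure. */
struct idle_accum {
	uint64_t pending_us;	/* idle time summed across tick wakeups */
};

/*
 * Record one measured idle residency.
 *
 * If the wakeup came from a tick that could have been stopped, keep
 * accumulating and report nothing. Otherwise store the whole
 * accumulated period in *commit_us, reset the accumulator, and return
 * true so the caller can feed that value to the estimator.
 */
static bool idle_accum_record(struct idle_accum *acc, uint64_t measured_us,
			      bool woken_by_tick, bool tick_was_avoidable,
			      uint64_t *commit_us)
{
	assert(acc && commit_us);

	acc->pending_us += measured_us;

	if (woken_by_tick && tick_was_avoidable)
		return false;	/* discard this wakeup, keep the sum */

	*commit_us = acc->pending_us;
	acc->pending_us = 0;
	return true;
}
```

With a 4000 us tick, two tick wakeups followed by a real wakeup after
another 1500 us would then be committed as a single 9500 us idle
period instead of three short ones.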