Re: [PATCH v9 10/10] cpuidle: menu: Avoid selecting shallow states with stopped tick

From: Peter Zijlstra
Date: Thu Apr 05 2018 - 10:11:56 EST


On Thu, Apr 05, 2018 at 03:49:32PM +0200, Rafael J. Wysocki wrote:
> On Thu, Apr 5, 2018 at 2:47 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > On Wed, Apr 04, 2018 at 10:50:36AM +0200, Rafael J. Wysocki wrote:
> >> + if (tick_nohz_tick_stopped()) {
> >> + /*
> >> + * If the tick is already stopped, the cost of possible short
> >> + * idle duration misprediction is much higher, because the CPU
> >> + * may be stuck in a shallow idle state for a long time as a
> >> + * result of it. In that case say we might mispredict and try
> >> + * to force the CPU into a state for which we would have stopped
> >> + * the tick, unless the tick timer is going to expire really
> >> + * soon anyway.
> >
> > Wait what; the tick was stopped, therefore it _cannot_ expire soon.
> >
> > *confused*
> >
> > Did you mean s/tick/a/ ?
>
> Yeah, that should be "a timer".

*phew* ok, that makes a lot more sense ;-)

My only concern with this is that we can now be overly pessimistic. The
predictor might know that statistically it's very likely a device
interrupt will arrive soon, but because the tick is already disabled, we
don't dare trust it, causing possible excessive latencies.

Would an alternative be to make @stop_tick be an enum capable of forcing
the tick back on?

enum tick_action {
NOHZ_TICK_STOP,
NOHZ_TICK_RETAIN,
NOHZ_TICK_START,
};

enum tick_action tick_action = NOHZ_TICK_STOP;

state = cpuidle_select(..., &tick_action);

switch (tick_action) {
case NOHZ_TICK_STOP:
tick_nohz_stop_tick();
break;

case NOHZ_TICK_RETAIN:
tick_nozh_retain_tick();
break;

case NOHZ_TICK_START:
tick_nohz_start_tick();
break;
};


Or something along those lines?