Re: [RFC/RFT][PATCH 6/7] sched: idle: Predict idle duration before stopping the tick

From: Peter Zijlstra
Date: Mon Mar 05 2018 - 08:37:48 EST


On Mon, Mar 05, 2018 at 08:19:15AM -0500, Rik van Riel wrote:
> > Also, I think that at this point you've introduced a problem; by not
> > disabling the tick unconditionally, we'll have extra wakeups due to
> > the (now still running) tick, which will bias the estimation, as per
> > reflect(), downwards.
> >
> > We should effectively discard tick wakeups when we could have
> > entered nohz but didn't, accumulating the idle period in reflect and
> > only commit once we get a !tick wakeup.
>
> How much of a problem would that actually be?
>
> Don't all but the very deepest C-states have
> target residencies that are orders of magnitude
> smaller than the tick period?
>
> In other words, if our sleeps end up getting
> "cut short" to 600us, we will still select C6,
> and it will not result in picking C3 by mistake.
>
> This only seems to affect C7 states and deeper.

That holds on modern Intel, but what about other platforms? This is
something that should work across the board.

> It may be worth fixing in the long run, but that
> would require keeping track of whether anything
> non-idle was done in-between two invocations of
> do_idle(), and then checking that there.
>
> That would include not just seeing whether there
> have been any context switches on the CPU (easy?),
> but also whether any non-timer interrupts were run.

Right, it's the interrupts that are 'interesting', although I suppose we
could magic something in irq_enter().
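
For illustration, the accumulate-and-commit idea quoted at the top could
look roughly like the sketch below. It is plain user-space C, not kernel
code: struct idle_est, idle_est_reflect() and the could_nohz flag are
hypothetical names made up for this example, and it ignores the short
time spent handling the tick itself between two idle periods.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-CPU bookkeeping; not the real governor data. */
struct idle_est {
	uint64_t pending_us;	/* idle time accumulated across tick wakeups */
	uint64_t last_idle_us;	/* most recently committed idle period */
	unsigned int samples;	/* number of committed idle periods */
};

/*
 * Record one idle exit.
 *
 * measured_us: length of the idle period that just ended
 * tick_wakeup: the wakeup was caused by the tick
 * could_nohz:  the tick could have been stopped but was left running
 *
 * A tick wakeup that would not have happened under nohz is not a real
 * sample, so its idle time is only accumulated. The sum is committed
 * once a non-tick wakeup arrives. Returns true when a sample was
 * committed.
 */
static bool idle_est_reflect(struct idle_est *est, uint64_t measured_us,
			     bool tick_wakeup, bool could_nohz)
{
	assert(est);

	est->pending_us += measured_us;

	/* Discard the wakeup itself, keep the time. */
	if (tick_wakeup && could_nohz)
		return false;

	est->last_idle_us = est->pending_us;
	est->pending_us = 0;
	est->samples++;
	return true;
}
```

With a 1000us tick period, two tick wakeups followed by a device
interrupt 400us later would commit a single 2400us sample instead of
three short ones, which is what keeps the estimate from being biased
downwards.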