Re: [RESEND PATCH v1 1/2] cpuidle: menu: Correct the criteria for stopping tick
From: leo . yan
Date: Sun Aug 12 2018 - 12:08:18 EST
On Sun, Aug 12, 2018 at 01:12:41PM +0200, Rafael J. Wysocki wrote:
> On Fri, Aug 10, 2018 at 11:03 AM <leo.yan@xxxxxxxxxx> wrote:
> >
> > On Fri, Aug 10, 2018 at 04:49:06PM +0800, Leo Yan wrote:
> > > On Fri, Aug 10, 2018 at 09:22:10AM +0200, Rafael J. Wysocki wrote:
> > > > On Fri, Aug 10, 2018 at 9:13 AM, <leo.yan@xxxxxxxxxx> wrote:
> > > > > On Thu, Aug 09, 2018 at 10:47:17PM +0200, Rafael J. Wysocki wrote:
> > > > >> On Thu, Aug 9, 2018 at 7:20 PM, Leo Yan <leo.yan@xxxxxxxxxx> wrote:
> > > >
> > > > [cut]
> > > >
> > > > >> And that will cause the tick to be stopped unnecessarily in certain
> > > > >> situations, so why is this better?
> > > > >
> > > > > > Let's look at the two cases below: in the first we configure
> > > > > > TICK_USEC=1000 (1ms) and in the second TICK_USEC=4000 (4ms).
> > > > > >
> > > > > > Let's assume we test on the same platform and have two runs.
> > > > > > In Case 1 we configure HZ=1000 so TICK_USEC=1ms, expected_interval
> > > > > > is 3ms and the deepest idle state's target residency is 2ms; the
> > > > > > idle governor will then choose the deepest state and skip the
> > > > > > calibration to a shallower state, because 'expected_interval' >
> > > > > > TICK_USEC.
> > > > >
> > > > > > In Case 2 we configure HZ=250 so TICK_USEC=4ms; expected_interval
> > > > > > (3ms) and the deepest idle state's target residency (2ms) are the
> > > > > > same as in Case 1. But because expected_interval < TICK_USEC, the
> > > > > > idle governor will do the calibration and select a shallower state.
> > > > > > If we imagine a platform where the deepest idle state's target
> > > > > > residency is smaller, the gap to TICK_USEC is bigger and the deepest
> > > > > > idle state is harder to select, because 'expected_interval' can
> > > > > > easily fall in the range [Deepest target residency..TICK_USEC).
> > > > >
> > > > > > This patch changes nothing for Case 1; it wants to optimize
> > > > > > Case 2 so that Case 2 has a chance to stay in the deepest idle
> > > > > > state. I understand that, from the performance perspective, we need
> > > > > > to avoid stopping the tick for shallow states; on the other hand, we
> > > > > > should not prevent the CPU from entering the deepest idle state just
> > > > > > because we want to keep the tick running, especially when the
> > > > > > expected interval is longer than the deepest state's target
> > > > > > residency.
> > > > >
> > > > > > Case 1:
> > > > > >                     Deepest idle state's target residency=2ms
> > > > > >                     |
> > > > > >                     V
> > > > > > |--------------------------------------------------------> time (ms)
> > > > > >           ^                   ^
> > > > > >           |                   |
> > > > > >     TICK_USEC=1ms   expected_interval=3ms
> > > > > >
> > > > > >
> > > > > > Case 2:
> > > > > >                     Deepest idle state's target residency = 2ms
> > > > > >                     |
> > > > > >                     V
> > > > > > |--------------------------------------------------------> time (ms)
> > > > > >                               ^         ^
> > > > > >                               |         |
> > > > > >         expected_interval = 3ms   TICK_USEC = 4ms
> > > > >
> > > > >
> > > > >
> > > > >> > unsigned int delta_next_us = ktime_to_us(delta_next);
> > > > >> >
> > > > >> > *stop_tick = false;
> > > > >> > --
> > > >
> > > > Well, I don't quite agree with the approach here, then.
> > > >
> > > > As I said in the previous reply, IMO restarting the stopped tick
> > > > before leaving the loop in do_idle() is pointless overhead. It is not
> > > > necessary to do that to avoid leaving CPUs in shallow idle states for
> > > > too long (I'll send an alternative patch to fix this issue shortly).
> > > >
> > > > While you may think that pointless overhead is not a problem, I don't
> > > > quite agree with that.
> > >
> > > I disagree that this patch will introduce any extra overhead.
> > >
> > > Firstly, the idle loop doesn't support restarting the tick even if this
> > > patch tells the idle loop to restart it;
>
> I'm not talking about restarting the tick, but about stopping it more
> often on average.
Ah, yes, I agree.
> > > secondly, this patch is mainly meant to resolve the issue that the
> > > CPU cannot stay in the deepest state in Case 2,
>
> I understand what you are trying to achieve here, but I don't agree with it.
I agree we need to find a more general method to fix this.
> The condition modified by this patch is not about how much time the
> CPU can potentially be idle, but about when it is expected to wake up.
> The "expected" part is really key here.
>
> The governor has gone through the effort of making an idle duration
> prediction and now it has a certain expectation regarding when the
> CPU will wake up. If the governor's prediction is any good at all and
> this expectation is in the tick range, the CPU will be woken up by
> something close enough to the tick in the majority of cases, so there
> is no need to stop the tick. Not because the CPU cannot be idle
> longer, but because it is expected to wake up early enough anyway (and
> yes, you can argue that 2 times the tick range may still be "early
> enough" and so on, but then I'd like to see numbers in support of
> that).
Thanks for the explanation; I also think this is a good methodology, and
I just want to improve on it a bit. For example, the governor always
compares the 'expectation' with TICK_USEC. TICK_USEC is a predefined
interval used as a boundary, but in reality the time until the next tick
is somewhere in the range [0..TICK_USEC], so with the current method we
cannot make the decision according to the tick's actual delta at run time.
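To make Case 1 and Case 2 concrete, here is a minimal userspace sketch
of the existing criterion. It is only a model, not the menu governor
code: both helper names are hypothetical, and the tick period is assumed
to be simply 1000000 / HZ microseconds.

```c
#include <stdbool.h>

/*
 * Assumed tick period in microseconds for a given HZ.
 * Hypothetical helper; a simplification of the kernel's TICK_USEC.
 */
static unsigned int tick_usec_for_hz(unsigned int hz)
{
	return 1000000u / hz;
}

/*
 * Simplified model of the existing criterion: the tick is kept
 * running when the predicted idle duration is shorter than one
 * full tick period, regardless of when the next tick is due.
 * Hypothetical helper, not a kernel function.
 */
static bool keeps_tick_current(unsigned int expected_interval_us,
			       unsigned int tick_usec)
{
	return expected_interval_us < tick_usec;
}
```

With expected_interval = 3000us, keeps_tick_current() is false for
HZ=1000 (Case 1) and true for HZ=250 (Case 2), although the prediction
and the deepest state's target residency are identical in both runs.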
I'd like to treat this issue as 'how to improve the decision for stopping
the tick?'. If we can make a better decision for stopping the tick, then
it's possible to resolve Case 2 without stopping the tick more often,
e.g. the CPU can even enter the deepest idle state without stopping the
tick if the prediction is less than the tick.

I will send out a new patch set based on these ideas for review.
> Now, if the governor is junk and its predictions are useless, the
> above will not be the case any more, but then I'm not sure what the
> benefit from using that governor at all is. :-)
I really don't think the governor and its predictions are useless :)

Just a reminder on a side topic: after introducing the tick in the idle
loop, the tick can also impact the predictions (e.g. it has some impact
on the correction factors, but modeling this needs more time).
> > > as a side effect it also can tell idle loop to restart the tick for case 3
> > > in below, actually IMHO this makes sense to tell the idle loop to
> > > enable the tick but idle loop can ignore this info.
> > >
> > > Furthermore, we have another thread for the patch to always stop
> > > tick after the tick has been stopped in the idle loop.
> > >
> > > So this patch is still valid.
> >
> > A correction for Case 3 below: actually this case will disappear if we
> > force expected_interval=ktime_to_us(delta_next) in the other
> > proposed patch. If so, this patch will have no chance to
> > introduce extra ticks.
>
> Yes, it will or at least it may.
>
> Assuming shot noise wakeups, if
> drv->states[drv->state_count-1].target_residency is less than
> TICK_USEC, the tick will be stopped for CPUs more often on average
> with the patch applied (simply because the idle duration range for
> which it will not be stopped is narrower then).
Yes, good point. So in the new approach I try to change the code to
compare with the next tick delta rather than TICK_USEC; it can keep the
tick running when the tick's expiry is far away (similar to comparing
with TICK_USEC), but we can also stop the tick if the tick is likely to
break the idle residency.
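As a rough, untested sketch of one possible reading of that idea (this
is not taken from any posted patch; both helpers and their parameters
are hypothetical and written as plain userspace C):

```c
#include <stdbool.h>

/*
 * Keep the tick only when the predicted wakeup is expected before
 * the next tick fires; otherwise the tick itself would cut the
 * idle period short, so it is stopped.
 * Hypothetical helper, not a kernel function.
 */
static bool keep_tick_by_delta(unsigned int expected_interval_us,
			       unsigned int delta_tick_us)
{
	return expected_interval_us < delta_tick_us;
}

/*
 * A state fits when its target residency is within the time until
 * the earliest expected wakeup: the prediction, or the next tick
 * if the tick is still running and fires sooner.
 */
static bool state_fits(unsigned int target_residency_us,
		       unsigned int expected_interval_us,
		       unsigned int delta_tick_us, bool tick_running)
{
	unsigned int limit = expected_interval_us;

	if (tick_running && delta_tick_us < limit)
		limit = delta_tick_us;
	return target_residency_us <= limit;
}
```

For Case 2 (prediction 3000us, target residency 2000us): with the next
tick 3500us away the tick stays running and the deepest state still
fits; with the next tick only 1000us away the tick would break the
residency, so it is stopped and the deepest state fits again.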
Thanks,
Leo Yan