RE: [PATCH] cpuidle: Fix the CPU stuck at C0 for 2-3s after PM_QOS back to DEFAULT
From: Liu, Chuansheng
Date: Thu Aug 14 2014 - 07:24:21 EST
> -----Original Message-----
> From: Peter Zijlstra [mailto:peterz@xxxxxxxxxxxxx]
> Sent: Thursday, August 14, 2014 6:54 PM
> To: Daniel Lezcano
> Cc: Liu, Chuansheng; Rafael J. Wysocki; linux-pm@xxxxxxxxxxxxxxx; LKML; Liu,
> Changcheng; Wang, Xiaoming; Chakravarty, Souvik K
> Subject: Re: [PATCH] cpuidle: Fix the CPU stuck at C0 for 2-3s after PM_QOS
> back to DEFAULT
>
> On Thu, Aug 14, 2014 at 12:29:32PM +0200, Daniel Lezcano wrote:
> > Hi Chuansheng,
> >
> > On 14 August 2014 04:11, Chuansheng Liu <chuansheng.liu@xxxxxxxxx>
> wrote:
> >
> > > We found sometimes even after we let PM_QOS back to DEFAULT,
> > > the CPU still stuck at C0 for 2-3s, don't do the new suitable C-state
> > > selection immediately after received the IPI interrupt.
> > >
> > > The code model is simply like below:
> > > {
> > >         pm_qos_update_request(&pm_qos, C1 - 1);
> > >         <== Here keep all cores at C0
> > >         ...;
> > >         pm_qos_update_request(&pm_qos, PM_QOS_DEFAULT_VALUE);
> > >         <== Here some cores still stuck at C0 for 2-3s
> > > }
> > >
> > > The reason is when pm_qos come back to DEFAULT, there is IPI interrupt to
> > > wake up the core, but when core is in poll idle state, the IPI interrupt
> > > can not break the polling loop.
> > >
> > > So here in the IPI callback interrupt, when currently the idle task is
> > > running, we need to forcedly set reschedule bit to break the polling loop,
> > > as for other non-polling idle state, IPI interrupt can break them directly,
> > > and setting reschedule bit has no harm for them too.
> > >
> > > With this fix, we saved about 30mV power in our android platform.
> > >
> > > Signed-off-by: Chuansheng Liu <chuansheng.liu@xxxxxxxxx>
> > > ---
> > > drivers/cpuidle/cpuidle.c | 8 +++++++-
> > > 1 file changed, 7 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
> > > index ee9df5e..9e28a13 100644
> > > --- a/drivers/cpuidle/cpuidle.c
> > > +++ b/drivers/cpuidle/cpuidle.c
> > > @@ -532,7 +532,13 @@ EXPORT_SYMBOL_GPL(cpuidle_register);
> > >
> > > static void smp_callback(void *v)
> > > {
> > > - /* we already woke the CPU up, nothing more to do */
> > > + /* we already woke the CPU up, and when the corresponding
> > > + * CPU is at polling idle state, we need to set the sched
> > > + * bit to trigger reselect the new suitable C-state, it
> > > + * will be helpful for power.
> > > + */
> > > + if (is_idle_task(current))
> > > + set_tsk_need_resched(current);
> > >
> >
> > Mmh, shouldn't we inspect the polling flag instead ? Peter (Cc'ed) did some
> > changes around this and I think we should ask its opinion. I am not sure
> > this code won't make all cpu to return to the scheduler and go back to the
> > idle task.
>
> Yes, this is wrong.. Also cpuidle should not know about this, so this is
> very much the wrong place to go fix this. Lemme have a look.

Inspecting the polling flag cannot fix the race between poll_idle() and
smp_callback(). If smp_callback() comes in while poll_idle() has not yet set
the polling flag, the resched bit is not set. poll_idle() then starts polling
anyway, without reselecting a C-state immediately, and that brings the power
regression back.
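
To make the window concrete, here is a rough userspace model in C. It is only
a sketch, not kernel code: struct model_cpu, model_smp_callback(),
model_poll_idle() and the two policy names are hypothetical and exist only for
this illustration, and the spin cap stands in for "stuck until something else
sets the resched bit".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one CPU; these are not the kernel's definitions. */
struct model_cpu {
	bool is_idle_task;	/* stands in for is_idle_task(current) */
	bool polling;		/* stands in for the polling flag */
	bool need_resched;	/* the bit the polling loop spins on */
};

enum callback_policy {
	CHECK_POLLING_FLAG,	/* set resched only if the polling flag is set */
	CHECK_IDLE_TASK,	/* set resched whenever the idle task is running */
};

/* Model of the IPI callback under one of the two policies. */
static void model_smp_callback(struct model_cpu *cpu,
			       enum callback_policy policy)
{
	bool hit = (policy == CHECK_POLLING_FLAG) ? cpu->polling
						  : cpu->is_idle_task;

	if (hit)
		cpu->need_resched = true;
}

/*
 * Model of the polling idle state. The IPI is delivered either before or
 * after the polling flag is set. Returns how many iterations the loop spun,
 * capped at max_spins (a capped result means the CPU stayed in the loop).
 */
static int model_poll_idle(enum callback_policy policy, bool ipi_before_flag,
			   int max_spins)
{
	struct model_cpu cpu = { .is_idle_task = true };
	int spins = 0;

	assert(max_spins > 0);

	if (ipi_before_flag)
		model_smp_callback(&cpu, policy);	/* the race window */

	cpu.polling = true;

	if (!ipi_before_flag)
		model_smp_callback(&cpu, policy);

	while (!cpu.need_resched && spins < max_spins)
		spins++;

	cpu.polling = false;
	return spins;
}
```

In this model, model_poll_idle(CHECK_POLLING_FLAG, true, 1000) spins up to the
cap, because the callback ran while the flag was still clear, whereas
model_poll_idle(CHECK_IDLE_TASK, true, 1000) leaves the loop at once.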