Re: [PATCH] cpuidle: Fix the CPU stuck at C0 for 2-3s after PM_QOS back to DEFAULT

From: Peter Zijlstra
Date: Thu Aug 14 2014 - 08:41:56 EST


On Thu, Aug 14, 2014 at 01:14:49PM +0200, Daniel Lezcano wrote:
> On 08/14/2014 01:00 PM, Peter Zijlstra wrote:
> >On Thu, Aug 14, 2014 at 12:29:32PM +0200, Daniel Lezcano wrote:
> >>Hi Chuansheng,
> >>
> >>On 14 August 2014 04:11, Chuansheng Liu <chuansheng.liu@xxxxxxxxx> wrote:
> >>
> >>>We found that sometimes, even after we set PM_QOS back to DEFAULT,
> >>>the CPU stays stuck at C0 for 2-3s and does not do the new suitable
> >>>C-state selection immediately after receiving the IPI.
> >>>
> >>>The code model is simply like below:
> >>>{
> >>>	pm_qos_update_request(&pm_qos, C1 - 1);
> >>>	< == Here keep all cores at C0
> >>>	...;
> >>>	pm_qos_update_request(&pm_qos, PM_QOS_DEFAULT_VALUE);
> >>>	< == Here some cores still stuck at C0 for 2-3s
> >>>}
> >>>
> >>>The reason is that when pm_qos comes back to DEFAULT, an IPI is sent to
> >>>wake up the core, but when the core is in the poll idle state, the IPI
> >>>cannot break the polling loop.
> >
> >So seeing how you're from @intel.com I'm assuming you're using x86 here.
> >
> >I'm not seeing how this can be possible, MWAIT is interrupted by IPIs
> >just fine, which means we'll fall out of the cpuidle_enter(), which
> >means we'll cpuidle_reflect(), and then leave cpuidle_idle_call().
> >
> >It will indeed not leave the cpu_idle_loop() function and go right back
> >into cpuidle_idle_call(), but that will then call cpuidle_select() which
> >should pick a new C state.
> >
> >So the interrupt _should_ work. If it doesn't you need to explain why.
>
> I think the issue is related to the poll_idle state, in
> drivers/cpuidle/driver.c. This state is x86 specific and inserted in the
> cpuidle table as the state 0 (POLL). There is no mwait for this state. It is
> a bit confusing because this state is not listed in the acpi / intel idle
> driver but inserted implicitly at the beginning of the idle table by the
> cpuidle framework when the driver is registered.
>
> static int poll_idle(struct cpuidle_device *dev,
> 		struct cpuidle_driver *drv, int index)
> {
> 	local_irq_enable();
> 	if (!current_set_polling_and_test()) {
> 		while (!need_resched())
> 			cpu_relax();
> 	}
> 	current_clr_polling();
>
> 	return index;
> }

Ah, well, in that case there's a ton more broken than just this.
kick_all_cpus_sync() won't work either, and cpuidle_reflect() pretty
much expects to be called after each interrupt.

Then again, not reflecting properly isn't really a problem, it's not like
not accounting interrupts is going to save power much.
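
To make the difference concrete, here is a small userspace model (not
kernel code) of the two behaviours discussed above. The names idle_event,
model_poll_idle() and model_wait_idle() are made up for illustration; the
only things taken from the thread are that the polling loop exits on
need_resched() alone, while an MWAIT-style state returns on any interrupt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event kinds delivered to an idle CPU in this model. */
enum idle_event { EV_IPI, EV_RESCHED };

/*
 * Model of a polling idle state: every event is "serviced", but the
 * loop only terminates once an event sets need_resched.
 * Returns true if the idle state was left; *consumed is the number of
 * events that were handled while in the state.
 */
static bool model_poll_idle(const enum idle_event *ev, size_t n,
			    size_t *consumed)
{
	bool need_resched = false;
	size_t i = 0;

	assert(consumed != NULL);

	while (!need_resched && i < n) {
		if (ev[i] == EV_RESCHED)
			need_resched = true;
		i++;
	}

	*consumed = i;
	return need_resched;
}

/*
 * Model of an MWAIT-style idle state: the first event of any kind
 * ends the state, so the caller gets to reselect a C-state.
 */
static bool model_wait_idle(const enum idle_event *ev, size_t n,
			    size_t *consumed)
{
	(void)ev;	/* the kind of event does not matter here */

	assert(consumed != NULL);

	*consumed = n > 0 ? 1 : 0;
	return n > 0;
}
```

With a stream of plain IPIs, model_poll_idle() never reports leaving the
state, whereas model_wait_idle() leaves on the first one. That is the
simplified picture of why a pm_qos update IPI does not cause a new
C-state selection while the CPU sits in the poll state.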

