Re: [PATCH V3 2/6] sched: idle: cpuidle: Check the latency req before idle

From: Daniel Lezcano
Date: Mon Nov 10 2014 - 12:26:40 EST


On 11/10/2014 05:15 PM, Peter Zijlstra wrote:
On Mon, Nov 10, 2014 at 04:58:29PM +0100, Daniel Lezcano wrote:
I don't get it, I've clearly not stared at it long enough, but why is
that STATE_START crap needed anywhere?

Excellent question :)

On x86, the config option ARCH_HAS_CPU_RELAX is set (x86 is the only one).
That leads to the macro CPUIDLE_DRIVER_STATE_START equal 1.

https://git.kernel.org/cgit/linux/kernel/git/rafael/linux-pm.git/tree/include/linux/cpuidle.h#n221

Then the acpi cpuidle driver and the intel_driver begin to fill the idle
state at index == CPUIDLE_DRIVER_STATE_START, so leaving the 0th idle state
empty.

https://git.kernel.org/cgit/linux/kernel/git/rafael/linux-pm.git/tree/drivers/idle/intel_idle.c#n848

https://git.kernel.org/cgit/linux/kernel/git/rafael/linux-pm.git/tree/drivers/acpi/processor_idle.c#n953

Then when the driver is registered and if ARCH_HAS_CPU_RELAX is set, the
cpuidle framework insert the 0th with the poll state (reminder : only for
x86).

Appears to me part of the problem is right there, the intel_idle and
proessor_idle should register the poll state themselves. That
immediately makes this weirdness go away.

Registering states from two places is not something that's sane or
desired I think.

I fully agree. I did a patchset in this direction but I realized the poll state most of the cases was not used.

https://git.kernel.org/cgit/linux/kernel/git/rafael/linux-pm.git/tree/drivers/cpuidle/driver.c#n195

If you look at the ladder governor (which I believe nobody is still using
it), or at the menu governor, all the indexes begin at
CPUIDLE_DRIVER_STATE_START, so all the code is preventing to use the 0th
state ... :)

So why is needed the poll state ?

1. When the latency_req is 0 (it returns 0, so the poll state)

Right, that makes sense.

2. When the select's menu governor fails to find a state *and* if the next
timer is before 5us

That seems rather arbitrary.

Yeah, and I am wondering if this very particular optimization couldn't be just removed. Is there still a benefit ?

Why would it fail to find a state?

To do short: it could fail to find a state fulfilling the constraints (some states could be disabled or the ).

https://git.kernel.org/cgit/linux/kernel/git/rafael/linux-pm.git/tree/drivers/cpuidle/driver.c#n195

And when we investigate the same code but on the other archs, the
CPUIDLE_DRIVER_STATE_START dance makes things slightly different.

So the conclusion is, we are inserting a state in the idle state array but
we do everything to prevent to use it :)

For me it appears logical to just kill this state from the x86 idle drivers
and move it in the idle_mainloop in case an idle state selection fails.

But why, ppc has a latency_req==0 state too, right?

Yes, this is why the arch_cpu_poll_idle, so ppc can override it with its optimized pooling loop (rep(); nop(); is x86).

I agree that we should shoot STATE_START in the head, but I feel we
should do it by fixing the state registration.

You can fix the state registration but that won't fix the STATE_START usage in the governors.

I really don't get why the governors should know about this though, its
just another state, they should iterate all states and pick the best,
given the power usage this state should really never be eligible unless
we're QoS forced or whatnot.

The governors just don't use the poll state at all, except for a couple of cases in menu.c defined above in the previous email. What is the rational of adding a state in the cpuidle driver and do everything we can to avoid using it ? From my POV, the poll state is a special state, we should remove from the driver's idle states like the arch_cpu_idle() is a specific idle state only used in idle.c (but which may overlap with an idle state in different archs eg. cpu_do_idle() and the 0th idle state).


--
<http://www.linaro.org/> Linaro.org â Open source software for ARM SoCs

Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/