Re: [PATCH V2 1/5] sched: idle: cpuidle: Check the latency req before idle

From: Daniel Lezcano
Date: Wed Nov 05 2014 - 09:28:45 EST

Next message: Daniel Mack: "Re: [PATCH 00/12] Add kdbus implementation"
Previous message: Joerg Roedel: "[PATCH] powerpc/iommu: Rename iommu_[un]map_sg functions"
Next in thread: Preeti U Murthy: "Re: [PATCH V2 1/5] sched: idle: cpuidle: Check the latency req before idle"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 10/29/2014 03:01 AM, Preeti U Murthy wrote:

On 10/29/2014 12:29 AM, Daniel Lezcano wrote:

On 10/28/2014 04:51 AM, Preeti Murthy wrote:

Hi Daniel,

On Thu, Oct 23, 2014 at 2:31 PM, Daniel Lezcano
<daniel.lezcano@xxxxxxxxxx> wrote:

When the pmqos latency requirement is set to zero, that means "poll in
all cases".

That is correctly implemented on x86 but not on the other archs.

As the code is written, if the latency request is zero, the governor
returns zero. On x86 that corresponds to the poll function, but on the
other archs it corresponds to the default idle function. For example, on
ARM this is wait-for-interrupt with a latency of '1', which violates the
constraint.

This is not true actually. On PowerPC the idle state 0 has an
exit_latency of 0.

To fix that, do the latency requirement check *before* calling the
cpuidle framework, so that we jump to the poll function without entering
cpuidle. That has several benefits:

Doing so actually hurts on PowerPC, because the idle loop defined for
idle state 0 is different from what cpu_relax() does in cpu_idle_loop().
The spinning is more power efficient in the former case. Moreover, we
also set certain register values which indicate an idle cpu. The
ppc_runlatch bits do precisely this. These register values are read by
some user space tools, so we will end up breaking them with this patch.

My suggestion is to keep the latency requirement check in
kernel/sched/idle.c, like you're doing in this patch. But before jumping
to cpu_idle_loop, verify that idle state 0 has an exit_latency > 0, in
addition to your check on latency_req == 0. If not, you can fall through
to the regular path of calling into the cpuidle driver.
The scheduler can query the cpuidle_driver structure anyway.
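
A minimal sketch of the check suggested above, written as user-space C so
it can be compiled on its own. The struct and function names
(idle_state, idle_driver, use_generic_poll) are hypothetical simplified
stand-ins, not the kernel's cpuidle structures or any code from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the cpuidle structures. */
struct idle_state {
	unsigned int exit_latency;	/* in microseconds */
};

struct idle_driver {
	struct idle_state states[4];
	int state_count;
};

/*
 * Returns true when the generic poll loop should be used directly,
 * false when the caller should fall through to the cpuidle driver.
 */
bool use_generic_poll(const struct idle_driver *drv, int latency_req)
{
	assert(drv != NULL && drv->state_count > 0);

	/* A non-zero request is left to the governor and driver. */
	if (latency_req != 0)
		return false;

	/*
	 * If state 0 already has an exit latency of zero, it honours the
	 * request itself, so only bypass the driver when it does not.
	 */
	return drv->states[0].exit_latency > 0;
}
```

With an ARM-like driver (state 0 exit_latency of 1) the sketch picks the
generic poll loop; with a PowerPC-like driver (state 0 exit_latency of 0)
it falls through to the driver.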

What do you think?

Thanks for reviewing the patch and spotting this.

Wouldn't it make sense to create:

void __weak cpu_idle_poll(void) ?

and override it with your specific poll function ?
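
For illustration, the weak-function idea could look roughly like the
sketch below. It is user-space C using the GCC/Clang weak attribute
directly (the kernel spells it __weak); need_resched_stub(),
cpu_relax_stub() and the two counters are hypothetical stand-ins added
only so the behaviour can be observed outside the kernel.

```c
#include <stdbool.h>

/* Hypothetical stand-ins so the sketch is observable in user space. */
int polls_before_resched = 3;
unsigned long relax_calls;

/* Stands in for need_resched(): reports pending work after a few polls. */
bool need_resched_stub(void)
{
	return polls_before_resched-- <= 0;
}

/* Stands in for cpu_relax(). */
void cpu_relax_stub(void)
{
	relax_calls++;
}

/*
 * Generic default poll loop, marked weak. An architecture could supply
 * a non-weak cpu_idle_poll() in its own file, and the linker would pick
 * that definition instead of this one.
 */
__attribute__((weak)) void cpu_idle_poll(void)
{
	while (!need_resched_stub())
		cpu_relax_stub();
}
```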

No, this would become ugly as far as I can see. A weak function has to be
defined under arch/* code. We would either need to duplicate the idle
loop that we already have in the drivers or point the weak function to
the first idle state defined by our driver. Neither is desirable
(calling into the driver from arch code is ugly). Another reason why I
don't like the idea of a weak function is that if you have missed
looking at a specific driver and it has an idle loop with features
similar to those on powerpc, you will have to spot it yourself and
include the arch specific cpu_idle_poll() for it.

Yes, I agree this is a fair point. But actually I don't see the point of having the poll loop in the cpuidle driver. These cleanups prepare the removal of the CPUIDLE_DRIVER_STATE_START macro, which leads to a lot of mess in the cpuidle code.

With the removal of this macro, we should be able to move the select loop from the menu governor and use it everywhere else. Furthermore, this state is flagged with TIME_VALID but its time is not actually valid: local interrupts are enabled, so we are also measuring the interrupt processing time.
Besides that, the idle loop for x86 is mostly not used.

So the idea would be to extract those idle loops from the drivers and use them directly when:
1. the idle selection fails (use the poll loop under certain circumstances we have to redefine)
2. the latency req is zero
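
The two cases above can be sketched as a small decision helper. This is
only an illustration: IDLE_POLL and idle_action() are hypothetical names,
and treating every negative selection result as "selection failed" is an
assumption, since the exact circumstances are still to be redefined.

```c
/* Hypothetical marker meaning "run the extracted poll loop". */
#define IDLE_POLL (-1)

/*
 * latency_req:    the pmqos latency requirement
 * selected_state: index returned by the governor, negative on failure
 *
 * Returns IDLE_POLL when the poll loop should run directly, otherwise
 * the index of the idle state to enter through cpuidle.
 */
int idle_action(int latency_req, int selected_state)
{
	/* Case 2: a zero latency requirement always means poll. */
	if (latency_req == 0)
		return IDLE_POLL;

	/* Case 1: the idle selection failed, fall back to polling. */
	if (selected_state < 0)
		return IDLE_POLL;

	return selected_state;
}
```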

That will result in cleaner code in cpuidle and in the governor.

Do you agree with that ?

But by having a check on the exit_latency, you are claiming that since
the driver's 0th idle state is no better than the generic idle loop in
cases of 0 latency req, we are better off calling the latter, which
looks reasonable. That way you don't have to bother about worsening the
idle loop behavior on any other driver.

--
<http://www.linaro.org/> Linaro.org | Open source software for ARM SoCs

Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

