[PATCH 2.6.26.y] cpuidle: 40000 wake/s unless idle=nomwait

From: Len Brown
Date: Sat Feb 21 2009 - 12:08:31 EST


From: Venkatesh Pallipadi <venkatesh.pallipadi@xxxxxxxxx>

upstream 2.6.27: 320eee776357db52d6fcfb11cff985b1976a4595
"cpuidle: Menu governor fix wrong usage of measured_us"
fixes http://bugzilla.kernel.org/show_bug.cgi?id=10914
"40000 wake/s unless idle=nomwait"


There is a bug in the menu governor, where we have

	if (data->elapsed_us < data->elapsed_us + measured_us)

with measured_us already having elapsed_us added in the tickless case here:

	unsigned int measured_us =
		cpuidle_get_last_residency(dev) + data->elapsed_us;
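
For illustration only, the arithmetic can be reproduced in user space. This is a minimal sketch, not kernel code: old_accumulate() and new_accumulate() are hypothetical helper names, the first mirroring the check quoted above and the second mirroring the check this patch introduces. Unsigned wraparound is relied on exactly as in the original.

```c
#include <limits.h>

/*
 * Hypothetical model of the old check. 'measured_us' already contains
 * 'elapsed_us', so the overflow test effectively adds 'elapsed_us' twice.
 * Returns the value the old code would have stored in elapsed_us.
 */
static unsigned int old_accumulate(unsigned int elapsed_us,
				   unsigned int last_residency_us)
{
	unsigned int measured_us = last_residency_us + elapsed_us;

	if (elapsed_us < elapsed_us + measured_us)
		return measured_us;
	return UINT_MAX;	/* the "-1" saturation value */
}

/*
 * Hypothetical model of the new check. Only the last idle period is
 * added, and "<=" lets a zero-length residency pass through unchanged.
 */
static unsigned int new_accumulate(unsigned int elapsed_us,
				   unsigned int last_idle_us)
{
	if (elapsed_us <= elapsed_us + last_idle_us)
		return elapsed_us + last_idle_us;
	return UINT_MAX;	/* saturate only on a real wraparound */
}
```

With these helpers, old_accumulate(0, 0) saturates to UINT_MAX because 0 < 0 is false, while new_accumulate(0, 0) stays at 0. Likewise, old_accumulate(UINT_MAX / 2 + 1, 100) saturates although the real sum still fits in an unsigned int.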

Also, it is last_residency, not measured_us, that needs to be used in the
comparison that distinguishes between expected and non-expected events.

Refactor menu_reflect() to fix these two problems.
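
The refactored flow can be tried outside the kernel with a small model. This is a sketch under stated assumptions: struct menu_model and model_reflect() are hypothetical names, the BREAK_FUZZ value of 4 us is assumed rather than taken from this patch, and the CPUIDLE_FLAG_TIME_VALID fallback is left out.

```c
#include <limits.h>

#define BREAK_FUZZ 4	/* assumed value, in microseconds */

/* Hypothetical stand-in for the fields of struct menu_device used here. */
struct menu_model {
	unsigned int expected_us;	/* predicted time until next timer */
	unsigned int elapsed_us;	/* idle time accumulated so far */
	unsigned int last_measured_us;
	unsigned int predicted_us;
};

static void model_reflect(struct menu_model *d, unsigned int last_idle_us,
			  unsigned int exit_latency_us)
{
	unsigned int measured_us;

	/* Cumulative idle time, saturating on unsigned wraparound. */
	if (d->elapsed_us <= d->elapsed_us + last_idle_us)
		measured_us = d->elapsed_us + last_idle_us;
	else
		measured_us = UINT_MAX;

	/* The prediction is updated on every call, in both branches. */
	d->predicted_us = measured_us > d->last_measured_us ?
			  measured_us : d->last_measured_us;

	/* Only the last idle period decides expected vs. non-expected. */
	if (last_idle_us + BREAK_FUZZ < d->expected_us - exit_latency_us) {
		d->last_measured_us = measured_us;
		d->elapsed_us = 0;
	} else {
		d->elapsed_us = measured_us;
	}
}
```

In this model, a wakeup well before expected_us resets elapsed_us to 0 and records the cumulative time in last_measured_us, while a wakeup close to expected_us keeps accumulating into elapsed_us.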

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@xxxxxxxxx>
Signed-off-by: Wei Gang <gang.wei@xxxxxxxxx>
Signed-off-by: Andi Kleen <ak@xxxxxxxxxxxxxxx>
Signed-off-by: Len Brown <len.brown@xxxxxxxxx>

---
drivers/cpuidle/governors/menu.c | 31 +++++++++++++++++++------------
1 files changed, 19 insertions(+), 12 deletions(-)

Index: linux-2.6.26.y/drivers/cpuidle/governors/menu.c
===================================================================
--- linux-2.6.26.y.orig/drivers/cpuidle/governors/menu.c
+++ linux-2.6.26.y/drivers/cpuidle/governors/menu.c
@@ -67,9 +67,9 @@ static void menu_reflect(struct cpuidle_
 {
 	struct menu_device *data = &__get_cpu_var(menu_devices);
 	int last_idx = data->last_state_idx;
-	unsigned int measured_us =
-		cpuidle_get_last_residency(dev) + data->elapsed_us;
+	unsigned int last_idle_us = cpuidle_get_last_residency(dev);
 	struct cpuidle_state *target = &dev->states[last_idx];
+	unsigned int measured_us;

 	/*
 	 * Ugh, this idle state doesn't support residency measurements, so we
@@ -77,20 +77,27 @@ static void menu_reflect(struct cpuidle_
 	 * for one full standard timer tick. However, be aware that this
 	 * could potentially result in a suboptimal state transition.
 	 */
-	if (!(target->flags & CPUIDLE_FLAG_TIME_VALID))
-		measured_us = USEC_PER_SEC / HZ;
+	if (unlikely(!(target->flags & CPUIDLE_FLAG_TIME_VALID)))
+		last_idle_us = USEC_PER_SEC / HZ;

-	/* Predict time remaining until next break event */
-	if (measured_us + BREAK_FUZZ < data->expected_us - target->exit_latency) {
-		data->predicted_us = max(measured_us, data->last_measured_us);
+	/*
+	 * measured_us and elapsed_us are the cumulative idle time, since the
+	 * last time we were woken out of idle by an interrupt.
+	 */
+	if (data->elapsed_us <= data->elapsed_us + last_idle_us)
+		measured_us = data->elapsed_us + last_idle_us;
+	else
+		measured_us = -1;
+
+	/* Predict time until next break event */
+	data->predicted_us = max(measured_us, data->last_measured_us);
+
+	if (last_idle_us + BREAK_FUZZ <
+	    data->expected_us - target->exit_latency) {
 		data->last_measured_us = measured_us;
 		data->elapsed_us = 0;
 	} else {
-		if (data->elapsed_us < data->elapsed_us + measured_us)
-			data->elapsed_us = measured_us;
-		else
-			data->elapsed_us = -1;
-		data->predicted_us = max(measured_us, data->last_measured_us);
+		data->elapsed_us = measured_us;
 	}
 }
