2.6.32.y stable kernel regression with taskset

From: Yinghai Lu
Date: Wed Sep 15 2010 - 14:47:54 EST


Found a problem with the cpuscaling test.

Under 2.6.32.21 with the userspace governor:
min freq load test time is nearly the same as max freq load test time, ~16 seconds

Under 2.6.18-194:
min freq load test time is ~40 seconds
max freq load test time is ~17 seconds

The test is:
1. Set the governor for one cpu to userspace.
2. Set the freq to min for that cpu.
3. Use taskset to run the load test only on that cpu, and measure the time the load test takes.
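
Step 3 can be approximated from inside a C program. This is only a rough sketch, not the actual cpuscaling test: it assumes steps 1 and 2 were already done by hand, and the function names pin_to_cpu() and time_load() and the busy loop itself are made up for illustration. It restricts the calling process to one cpu with sched_setaffinity(2), which is the system call taskset uses, and then times a fixed amount of cpu-bound work.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdint.h>
#include <time.h>

/*
 * Restrict the calling process to a single cpu, as taskset does via
 * sched_setaffinity(2). Returns 0 on success, -1 on error.
 * (Hypothetical helper name.)
 */
int pin_to_cpu(int cpu)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	return sched_setaffinity(0, sizeof(set), &set);
}

/*
 * Stand-in for the load test: a fixed number of integer operations.
 * Returns the elapsed wall-clock time in seconds.
 * (Hypothetical helper name; the real test load is not shown here.)
 */
double time_load(uint64_t iterations)
{
	struct timespec start, end;
	volatile uint64_t acc = 0;	/* volatile keeps the loop from being optimized away */
	uint64_t i;

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (i = 0; i < iterations; i++)
		acc += i * i;
	clock_gettime(CLOCK_MONOTONIC, &end);

	return (double)(end.tv_sec - start.tv_sec) +
	       (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

If the pinning works and the cpu is held at min freq, time_load() with the same iteration count should take noticeably longer than at max freq.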

So that means taskset did not put the load test on the cpu that we wanted. The other cpus still have the ondemand governor, so the load test finishes much faster there.
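
One way to check where a pinned task really runs is to compare the requested cpu with what sched_getcpu() reports after setting the affinity. This is a minimal diagnostic sketch, and the function name is made up for illustration. It only covers a process that pins itself while running, so it may not exercise the same path as the regression; it is meant as a quick sanity check.

```c
#define _GNU_SOURCE
#include <sched.h>

/*
 * Pin the calling process to `cpu`, then ask which cpu it is running on.
 * Returns 1 if it runs on the requested cpu, 0 if it runs elsewhere,
 * and -1 if either call fails. (Hypothetical helper name.)
 */
int pinned_and_running_on(int cpu)
{
	cpu_set_t set;
	int now;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	if (sched_setaffinity(0, sizeof(set), &set) != 0)
		return -1;

	now = sched_getcpu();
	if (now < 0)
		return -1;

	return now == cpu;
}
```

A return value of 0 would mean the affinity mask was accepted but the task is still running on another cpu.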

git bisect report:

c6fc81afa2d7ef2f775e48672693d8a0a8a7325d is the first bad commit
commit c6fc81afa2d7ef2f775e48672693d8a0a8a7325d
Author: John Wright <john.wright@xxxxxx>
Date: Tue Apr 13 16:55:37 2010 -0600

sched: Fix a race between ttwu() and migrate_task()

Based on commit e2912009fb7b715728311b0d8fe327a1432b3f79 upstream, but
done differently as this issue is not present in .33 or .34 kernels due
to rework in this area.

If a task is in the TASK_WAKING state, then try_to_wake_up() is working
on it, and it will place it on the correct cpu.

This commit ensures that neither migrate_task() nor __migrate_task()
calls set_task_cpu(p) while p is in the TASK_WAKING state. Otherwise,
there could be two concurrent calls to set_task_cpu(p), resulting in
the task's cfs_rq being inconsistent with its cpu.

Signed-off-by: John Wright <john.wright@xxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxx>

:040000 040000 a9d18950d8edddb761c0266706f671f0e9a006fe 2c3a7d7d5e616ecc276b4e93f4b6e5162a9382c8 M kernel

diff --git a/kernel/sched.c b/kernel/sched.c
index 2591562..3261c19 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -2116,12 +2116,10 @@ migrate_task(struct task_struct *p, int dest_cpu, struct

/*
* If the task is not on a runqueue (and not running), then
- * it is sufficient to simply update the task's cpu field.
+ * the next wake-up will properly place the task.
*/
- if (!p->se.on_rq && !task_running(rq, p)) {
- set_task_cpu(p, dest_cpu);
+ if (!p->se.on_rq && !task_running(rq, p))
return 0;
- }

init_completion(&req->done);
req->task = p;
@@ -7167,6 +7165,9 @@ static int __migrate_task(struct task_struct *p, int src_c
/* Already moved. */
if (task_cpu(p) != src_cpu)
goto done;
+ /* Waking up, don't get in the way of try_to_wake_up(). */
+ if (p->state == TASK_WAKING)
+ goto fail;
/* Affinity changed (again). */
if (!cpumask_test_cpu(dest_cpu, &p->cpus_allowed))
goto fail;


After reverting it, cpuscaling passes the test.

BTW, current mainline is OK.

Thanks

Yinghai Lu