[patch] new-task-fix.patch, 2.6.8.1-mm1

From: Ingo Molnar
Date: Tue Aug 17 2004 - 03:47:07 EST



* Rusty Russell <rusty@xxxxxxxxxxxxxxx> wrote:

> Looking through 2.6.8.1-mm1, I see this code which doesn't make sense:

> So, first off, the statements under "if (unlikely(cpu != this_cpu))"
> can be folded into the previous block, since that's under the same
> test. Secondly, why is sleep_avg being set twice to the same thing,
> and why are we happy to adjust it the first time without holding the
> rq lock for current, but the second time we make sure we are holding
> the rq lock? [...]

agreed, this is a bug - the code has rotten somewhat. The attached patch
fixes it. I've also cleaned up the locking and added this_rq, to make
clear when and how we are hopping from one runqueue to another. (this
cleanup would have made the original bug more obvious as well.)

This comes after sched-nonlinear-timeslicespatch.patch in 2.6.8.1-mm1.
Tested on x86.

> [...] recalc_task_prio seems happy to adjust a tasks ->sleep_avg
> without holding any lock at all...

this is not true - we always update ->avg_sleep while holding the lock.
recalc_task_prio() is stricly called with p's runqueue lock held.

Ingo

Rusty noticed that we update the parent ->avg_sleep without holding the
runqueue lock. Also the code needed cleanups.

Signed-off-by: Ingo Molnar <mingo@xxxxxxx>

--- linux/kernel/sched.c.orig
+++ linux/kernel/sched.c
@@ -1344,7 +1344,7 @@ void fastcall wake_up_new_task(task_t *
{
unsigned long flags;
int this_cpu, cpu;
- runqueue_t *rq;
+ runqueue_t *rq, *this_rq;

rq = task_rq_lock(p, &flags);
cpu = task_cpu(p);
@@ -1386,8 +1386,15 @@ void fastcall wake_up_new_task(task_t *
} else
/* Run child last */
__activate_task(p, rq);
+ /*
+ * We skip the following code due to cpu == this_cpu
+ *
+ * task_rq_unlock(rq, &flags);
+ * this_rq = task_rq_lock(current, &flags);
+ */
+ this_rq = rq;
} else {
- runqueue_t *this_rq = cpu_rq(this_cpu);
+ this_rq = cpu_rq(this_cpu);

/*
* Not the local CPU - must adjust timestamp. This should
@@ -1399,18 +1406,17 @@ void fastcall wake_up_new_task(task_t *
if (TASK_PREEMPTS_CURR(p, rq))
resched_task(rq->curr);

- current->sleep_avg = JIFFIES_TO_NS(CURRENT_BONUS(current) *
- PARENT_PENALTY / 100 * MAX_SLEEP_AVG / MAX_BONUS);
schedstat_inc(rq, wunt_moved);
- }
-
- if (unlikely(cpu != this_cpu)) {
+ /*
+ * Parent and child are on different CPUs, now get the
+ * parent runqueue to update the parent's ->sleep_avg:
+ */
task_rq_unlock(rq, &flags);
- rq = task_rq_lock(current, &flags);
+ this_rq = task_rq_lock(current, &flags);
}
current->sleep_avg = JIFFIES_TO_NS(CURRENT_BONUS(current) *
PARENT_PENALTY / 100 * MAX_SLEEP_AVG / MAX_BONUS);
- task_rq_unlock(rq, &flags);
+ task_rq_unlock(this_rq, &flags);
}

/*