[PATCH] fix wait_task_inactive race (was Re: Race condition in ptrace)

From: Nick Piggin
Date: Sat Feb 05 2005 - 22:29:10 EST


Nick Piggin wrote:

> Something like the following (untested) extension of Bodo's work
> could be the minimal fix for 2.6.11. As I've said though, I'd
> consider it a hack and prefer to do something about the locking.
> That could be done after 2.6.11 though. Depends how you feel.


I think this is the right fix.

When a task is put to sleep, it is dequeued from the runqueue
while it is still running. The problem is that the runqueue
lock can be dropped and retaken in schedule() before the task
actually schedules off, and wait_task_inactive did not account
for this.

I introduced a new function to test for this state, fixed
wait_task_inactive, and converted an open-coded test over to it.

I did a quick audit of sched.c, and nothing else seems to have
made the same mistake.
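
To make the window concrete, here is a rough sketch of the test a
correct wait loop has to make (kernel-context pseudocode only, using
the same 2.6 runqueue helpers the patch touches; the function name is
made up, and the real loop in the diff below additionally yields when
the task is merely preempted rather than spinning):

/*
 * Sketch only: in schedule(), a task going to sleep is dequeued
 * (p->array becomes NULL) before the context switch, and rq->lock can
 * be dropped and retaken in between. So "off the runqueue" does not
 * yet imply "not running"; both must be checked under the runqueue lock.
 */
static void wait_until_really_inactive(task_t *p)
{
	unsigned long flags;
	runqueue_t *rq;

	for (;;) {
		rq = task_rq_lock(p, &flags);
		if (!p->array && !task_running(rq, p)) {
			/* Dequeued and actually switched off the CPU. */
			task_rq_unlock(rq, &flags);
			break;
		}
		task_rq_unlock(rq, &flags);
		cpu_relax();
	}
}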

Signed-off-by: Nick Piggin <nickpiggin@xxxxxxxxxxxx>


Question: why does wait_task_inactive have different semantics
for UP && PREEMPT than for SMP && PREEMPT? I can see that the kthread
caller probably isn't used in the UP case, but technically it is
relying on behaviour that it doesn't get with UP and PREEMPT.
Looks like the ptrace.c caller won't care.

But still, can we either fix it or put a nice comment there?
Preferably fix it, if this isn't a very performance-critical path?
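
For reference, the declaration in include/linux/sched.h looks roughly
like this (quoting from memory, so treat it as approximate): on !SMP
the call collapses to a no-op, so a UP && PREEMPT caller gets no
guarantee that the target has actually scheduled off, while
SMP && PREEMPT callers do.

#ifdef CONFIG_SMP
extern void wait_task_inactive(task_t * p);
#else
#define wait_task_inactive(p)	do { } while (0)
#endif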




---

linux-2.6-npiggin/kernel/sched.c | 12 ++++++++++--
1 files changed, 10 insertions(+), 2 deletions(-)

diff -puN kernel/sched.c~sched-fixup-races kernel/sched.c
--- linux-2.6/kernel/sched.c~sched-fixup-races 2005-02-06 14:03:53.000000000 +1100
+++ linux-2.6-npiggin/kernel/sched.c 2005-02-06 14:06:43.000000000 +1100
@@ -298,6 +298,14 @@ static DEFINE_PER_CPU(struct runqueue, r
 #endif
 
 /*
+ * Is the task currently running or on the runqueue
+ */
+static int task_onqueue(runqueue_t *rq, task_t *p)
+{
+	return (p->array || task_running(rq, p));
+}
+
+/*
  * task_rq_lock - lock the runqueue a given task resides on and disable
  * interrupts. Note the ordering: we can safely lookup the task_rq without
  * explicitly disabling preemption.
@@ -836,7 +844,7 @@ static int migrate_task(task_t *p, int d
  * If the task is not on a runqueue (and not running), then
  * it is sufficient to simply update the task's cpu field.
  */
-	if (!p->array && !task_running(rq, p)) {
+	if (!task_onqueue(rq, p)) {
 		set_task_cpu(p, dest_cpu);
 		return 0;
 	}
@@ -867,7 +875,7 @@ void wait_task_inactive(task_t * p)
 repeat:
 	rq = task_rq_lock(p, &flags);
 	/* Must be off runqueue entirely, not preempted. */
-	if (unlikely(p->array)) {
+	if (unlikely(task_onqueue(rq, p))) {
 		/* If it's preempted, we yield. It could be a while. */
 		preempted = !task_running(rq, p);
 		task_rq_unlock(rq, &flags);

_