[PATCH 1/3] set EXIT_DEAD state in do_exit(), not in schedule()

From: Oleg Nesterov
Date: Sat Aug 26 2006 - 10:52:14 EST


schedule() checks PF_DEAD on every context switch and sets ->state = EXIT_DEAD
to ensure that the exiting task will be deactivated. Note that this EXIT_DEAD
is in fact a "random" value, we can use any bit except normal TASK_XXX values.

It is better to set this state in do_exit() along with PF_DEAD flag and remove
that check in schedule().

We are safe wrt concurrent try_to_wake_up() (for example ptrace, tkill), it can
not change task's ->state: the 'state' argument of try_to_wake_up() can't have
EXIT_DEAD bit. And in case when try_to_wake_up() sees a stale value of ->state
== TASK_RUNNING it will do nothing.

Signed-off-by: Oleg Nesterov <oleg@xxxxxxxxxx>

--- 2.6.18-rc4/kernel/exit.c~1_state 2006-08-19 23:47:01.000000000 +0400
+++ 2.6.18-rc4/kernel/exit.c 2006-08-26 20:38:04.000000000 +0400
@@ -957,6 +957,7 @@ fastcall NORET_TYPE void do_exit(long co
preempt_disable();
BUG_ON(tsk->flags & PF_DEAD);
tsk->flags |= PF_DEAD;
+ tsk->state = EXIT_DEAD;

schedule();
BUG();
--- 2.6.18-rc4/kernel/sched.c~1_state 2006-08-20 00:37:27.000000000 +0400
+++ 2.6.18-rc4/kernel/sched.c 2006-08-26 20:39:02.000000000 +0400
@@ -3311,9 +3311,6 @@ need_resched_nonpreemptible:

spin_lock_irq(&rq->lock);

- if (unlikely(prev->flags & PF_DEAD))
- prev->state = EXIT_DEAD;
-
switch_count = &prev->nivcsw;
if (prev->state && !(preempt_count() & PREEMPT_ACTIVE)) {
switch_count = &prev->nvcsw;

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/