Re: [BUG] TASK_DEAD task is able to be woken up in special condition

From: Peter Zijlstra
Date: Tue Jan 24 2012 - 05:55:42 EST


On Tue, 2012-01-24 at 11:19 +0100, Peter Zijlstra wrote:
> On Wed, 2012-01-18 at 15:20 +0100, Oleg Nesterov wrote:
> > do_exit() is different because it can not handle the spurious wakeup.
> > Well, maybe we can? We can simply do
> >
> > 	for (;;) {
> > 		tsk->state = TASK_DEAD;
> > 		schedule();
> > 	}
> >
> > __schedule() can't race with ttwu() once it takes rq->lock. If the
> > exiting task is deactivated, finish_task_switch() will see EXIT_DEAD.
>
> TASK_DEAD, right?
>
> > Unless I missed something, the only problem is preempt_disable(),
> > but schedule_debug() checks ->exit_state.
> >
> > OTOH, if we fix this race then probably schedule_debug() should
> > check state == EXIT_DEAD instead.
>
> Hmm, interesting. On the up side that removes the need for that infinite
> loop after BUG; the down side is of course that we lose the BUG itself
> too. Now I'm not too sure we actually care about that: a task spinning
> at 100% in X state should be fairly obvious borkage, and it's not like
> we hit this thing very often.

Something like so, right? schedule_debug() already tests
prev->exit_state so it should DTRT afaict.
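
To make the spurious-wakeup argument concrete, here is a rough userspace
model of the two shapes of the exit path. It is only a sketch, not kernel
code: toy_task, toy_schedule(), toy_exit_loop() and toy_exit_once() are
made-up names, and the "wakeup" is simulated by a counter rather than by a
real ttwu() racing with __schedule().

```c
#include <assert.h>

/* Toy stand-ins for the kernel's task states; values are illustrative. */
enum toy_state { TOY_RUNNING = 0, TOY_DEAD = 1 };

struct toy_task {
	enum toy_state state;
	int pending_wakeups;	/* spurious wakeups still to be delivered */
	int reaped;		/* set once the "scheduler" saw TOY_DEAD */
};

/*
 * Hypothetical model of schedule(): a pending wakeup overwrites ->state
 * before the scheduler looks at it, so the task is not deactivated and
 * the call returns 0. Returns 1 once the task was seen dead (the real
 * schedule() would simply never return at that point).
 */
static int toy_schedule(struct toy_task *t)
{
	if (t->pending_wakeups > 0) {
		t->pending_wakeups--;
		t->state = TOY_RUNNING;	/* wakeup stomps on ->state */
		return 0;
	}
	if (t->state == TOY_DEAD) {
		t->reaped = 1;
		return 1;
	}
	return 0;
}

/* The proposed loop: re-assert the dead state before every schedule(). */
static int toy_exit_loop(struct toy_task *t)
{
	int passes = 0;

	for (;;) {
		t->state = TOY_DEAD;
		passes++;
		if (toy_schedule(t))
			break;	/* only so the model terminates */
	}
	assert(t->state == TOY_DEAD);
	return passes;
}

/* The old shape: set the state once, schedule once. */
static int toy_exit_once(struct toy_task *t)
{
	t->state = TOY_DEAD;
	return toy_schedule(t);	/* 0 here is where BUG() used to fire */
}
```

In this model a task with N pending wakeups needs N + 1 passes through the
loop before it is reaped, whereas the set-once variant falls through on the
first stray wakeup.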

Also, while going over this again, I think Yasunori-San's patch isn't
sufficient: note how the p->state = TASK_RUNNING in ttwu_do_wakeup() can
happen outside of p->pi_lock when the task gets queued on a remote CPU.

---
 kernel/exit.c |   17 +++++++++++------
 1 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index 294b170..ccd4f84 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -1039,13 +1039,18 @@ void do_exit(long code)
 		__this_cpu_add(dirty_throttle_leaks, tsk->nr_dirtied);
 	exit_rcu();
 	/* causes final put_task_struct in finish_task_switch(). */
-	tsk->state = TASK_DEAD;
 	tsk->flags |= PF_NOFREEZE;	/* tell freezer to ignore us */
-	schedule();
-	BUG();
-	/* Avoid "noreturn function does return". */
-	for (;;)
-		cpu_relax();	/* For when BUG is null */
+	for (;;) {
+		/*
+		 * A spurious wakeup, eg. generated by rwsem when down()'s call
+		 * to schedule() doesn't happen but the wakeup from the
+		 * previous owner's up() did, can stomp on our ->state.
+		 *
+		 * This loop also avoids "noreturn function does return"
+		 */
+		tsk->state = TASK_DEAD;
+		schedule();
+	}
 }
 
 EXPORT_SYMBOL_GPL(do_exit);

