Re: finish_task_switch && prev_state (Was: sched, timers: use after free in __lock_task_sighand when exiting a process)

From: Oleg Nesterov
Date: Tue Jul 15 2014 - 10:27:38 EST


On 07/15, Peter Zijlstra wrote:
>
> @@ -2211,13 +2211,15 @@ static void finish_task_switch(struct rq *rq, struct task_struct *prev)
>
> /*
> * A task struct has one reference for the use as "current".
> + *
> * If a task dies, then it sets TASK_DEAD in tsk->state and calls
> - * schedule one last time. The schedule call will never return, and
> - * the scheduled task must drop that reference.
> - * The test for TASK_DEAD must occur while the runqueue locks are
> - * still held, otherwise prev could be scheduled on another cpu, die
> - * there before we look at prev->state, and then the reference would
> - * be dropped twice.
> + * schedule one last time. The schedule call will never return, and the
> + * scheduled task must drop that reference.
> + *
> + * The test for TASK_DEAD must occur while the runqueue locks are still
> + * held, otherwise we can race with RUNNING -> DEAD transitions, and
> + * then the reference would be dropped twice.
> + *
> * Manfred Spraul <manfred@xxxxxxxxxxxxxxxx>
> */

Agreed, this looks much more understandable!


And probably I missed something again, but it seems that this logic is broken
with __ARCH_WANT_UNLOCKED_CTXSW.

Of course, even if I am right this is purely theoretical, but smp_wmb() before
"->on_cpu = 0" is not enough and we need a full barrier?
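
A minimal userspace sketch of one reading of this concern, using C11 atomics
rather than kernel primitives. It assumes the problem is that the load of
prev->state must be ordered before the "->on_cpu = 0" store, which a
store-store barrier such as smp_wmb() does not guarantee. The names
task_sketch, SKETCH_TASK_DEAD and finish_switch_sketch() are hypothetical
stand-ins, not the kernel's definitions, and the constant value is arbitrary.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; values are arbitrary. */
enum { SKETCH_TASK_RUNNING = 0, SKETCH_TASK_DEAD = 64 };

struct task_sketch {
	atomic_long state;   /* stands in for tsk->state */
	atomic_int  on_cpu;  /* stands in for tsk->on_cpu */
	atomic_int  usage;   /* stands in for the task reference count */
};

/*
 * Sample prev's state, then release prev to other CPUs by clearing on_cpu.
 * Returns the sampled state so the caller can see what was observed.
 */
static long finish_switch_sketch(struct task_sketch *prev)
{
	long prev_state;

	assert(prev != NULL);

	/* This load must not be reordered after the on_cpu store below. */
	prev_state = atomic_load_explicit(&prev->state, memory_order_relaxed);

	/*
	 * A store-store barrier would only order earlier stores against the
	 * store below. A full fence also orders the earlier load, so another
	 * CPU that sees on_cpu == 0 cannot influence the value read above.
	 */
	atomic_thread_fence(memory_order_seq_cst);
	atomic_store_explicit(&prev->on_cpu, 0, memory_order_relaxed);

	/* Drop the "current" reference exactly once, based on the sample. */
	if (prev_state == SKETCH_TASK_DEAD)
		atomic_fetch_sub_explicit(&prev->usage, 1, memory_order_acq_rel);

	return prev_state;
}
```

The point of the sketch is only the ordering: the decision to drop the
reference is made from a value sampled before prev becomes runnable elsewhere.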

Oleg.
