[PATCH 1/2] sched: schedule_tail() should disable preemption

From: Oleg Nesterov
Date: Wed Oct 08 2014 - 14:36:50 EST

finish_task_switch() enables preemption, so post_schedule(rq) can be
called on the wrong (and even dead) CPU. AFAICS nothing really bad
can happen, but in this case we can wrongly clear rq->post_schedule
on that CPU. And this simply looks wrong in any case.

Another problem is that finish_task_switch() itself runs with preemption
enabled after finish_lock_switch(). If nothing else, this means that the
->sched_in() notifier can't trust its "cpu" argument.
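
The first problem can be modeled in user space. What follows is only a toy
sketch, not kernel code: struct rq, runqueues[], current_cpu, preempt_count,
try_migrate() and the *_model() helpers are hypothetical stand-ins that borrow
kernel names, and the migration is forced by hand at the point where
finish_task_switch() enables preemption.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 2

/* Toy runqueue: only the flag discussed in the changelog is modeled. */
struct rq {
	bool post_schedule;
};

static struct rq runqueues[NR_CPUS];
static int current_cpu;		/* CPU the modeled task runs on */
static int preempt_count;	/* > 0: the task cannot be migrated */

static void reset_model(void)
{
	int i;

	for (i = 0; i < NR_CPUS; i++)
		runqueues[i].post_schedule = true;
	current_cpu = 0;
	/* Model assumption: schedule_tail() is entered with preemption disabled once. */
	preempt_count = 1;
}

static struct rq *this_rq(void)
{
	assert(current_cpu >= 0 && current_cpu < NR_CPUS);
	return &runqueues[current_cpu];
}

/* Hypothetical helper: migration only succeeds while preemption is enabled. */
static void try_migrate(int cpu)
{
	if (preempt_count == 0)
		current_cpu = cpu;
}

/* Stand-in for finish_task_switch(): enables preemption, then the task may move. */
static void finish_task_switch_model(int migrate_to)
{
	preempt_count--;
	try_migrate(migrate_to);
}

/* Clears the flag; returns true if rq belongs to the CPU we are running on. */
static bool post_schedule_model(struct rq *rq)
{
	rq->post_schedule = false;
	return rq == this_rq();
}

/* Old flow: rq is read once, then used after preemption was enabled. */
static bool schedule_tail_old(int migrate_to)
{
	struct rq *rq = this_rq();

	finish_task_switch_model(migrate_to);
	return post_schedule_model(rq);
}

/* Patched flow: the extra disable keeps the task on its CPU until the end. */
static bool schedule_tail_new(int migrate_to)
{
	struct rq *rq;
	bool ok;

	preempt_count++;	/* preempt_disable() */
	rq = this_rq();
	finish_task_switch_model(migrate_to);
	ok = post_schedule_model(rq);
	preempt_count--;	/* preempt_enable() */
	return ok;
}
```

After reset_model(), schedule_tail_old(1) returns false: the task ends up on
CPU 1 while the flag was cleared on CPU 0's runqueue. schedule_tail_new(1)
returns true because the migration attempt is a no-op while the model's
preempt_count is still nonzero.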

Signed-off-by: Oleg Nesterov <oleg@xxxxxxxxxx>
---
kernel/sched/core.c | 11 +++++------
1 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 703c7e6..3f267e8 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2277,15 +2277,14 @@ static inline void post_schedule(struct rq *rq)
asmlinkage __visible void schedule_tail(struct task_struct *prev)
- struct rq *rq = this_rq();
+ struct rq *rq;

+ /* finish_task_switch() drops rq->lock and enables preemption */
+ preempt_disable();
+ rq = this_rq();
finish_task_switch(rq, prev);
- /*
- * FIXME: do we need to worry about rq being invalidated by the
- * task_switch?
- */
+ preempt_enable();

if (current->set_child_tid)
put_user(task_pid_vnr(current), current->set_child_tid);

