Re: [PATCH v1] kthread/smpboot: Serialize kthread parking against wakeup
From: Oleg Nesterov
Date: Tue Jun 05 2018 - 12:35:24 EST
On 06/05, Peter Zijlstra wrote:
> On Tue, Jun 05, 2018 at 05:22:12PM +0200, Peter Zijlstra wrote:
> > > OK, but __kthread_parkme() can be preempted before it calls schedule(), so the
> > > caller still can be migrated? Plus kthread_park_complete() can be called twice.
> > Argh... I forgot TASK_DEAD does the whole thing with preempt_disable().
> > Let me stare at that a bit.
> This should ensure we only ever complete when we read PARKED, right?
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 8d59b259af4a..e513b4600796 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2641,7 +2641,7 @@ prepare_task_switch(struct rq *rq, struct task_struct *prev,
> * past. prev == current is still correct but we need to recalculate this_rq
> * because prev may have moved to another CPU.
> -static struct rq *finish_task_switch(struct task_struct *prev)
> +static struct rq *finish_task_switch(struct task_struct *prev, bool preempt)
> struct rq *rq = this_rq();
> @@ -2674,7 +2674,7 @@ static struct rq *finish_task_switch(struct task_struct *prev)
> * We must observe prev->state before clearing prev->on_cpu (in
> * finish_task), otherwise a concurrent wakeup can get prev
> - * running on another CPU and we could rave with its RUNNING -> DEAD
> + * running on another CPU and we could race with its RUNNING -> DEAD
> * transition, resulting in a double drop.
> prev_state = prev->state;
> @@ -2720,7 +2720,8 @@ static struct rq *finish_task_switch(struct task_struct *prev)
> case TASK_PARKED:
> - kthread_park_complete(prev);
> + if (!preempt)
> + kthread_park_complete(prev);
Yes, but this won't fix the race described by Kohli...
Plus this complicates the schedule() paths for a very special case, and to me
it seems that all this kthread_park/unpark logic needs some serious cleanup...
Not that I can suggest something better right now.