Re: [PATCH wq/for-3.14-fixes] workqueue: ensure @task is valid across kthread_stop()

From: Lai Jiangshan
Date: Tue Feb 18 2014 - 22:37:19 EST


On 02/19/2014 05:37 AM, Tejun Heo wrote:
> Hello, Lai.
>
> I massaged the patch a bit and applied it to wq/for-3.14-fixes.
>
> Thanks.
> -------- 8< --------
> From 5bdfff96c69a4d5ab9c49e60abf9e070ecd2acbb Mon Sep 17 00:00:00 2001
> From: Lai Jiangshan <laijs@xxxxxxxxxxxxxx>
> Date: Sat, 15 Feb 2014 22:02:28 +0800
>
> When a kworker should die, the kworker is notified through WORKER_DIE
> flag instead of kthread_should_stop(). This, IIRC, is primarily to
> keep the test synchronized inside worker_pool lock. WORKER_DIE is
> first set while holding pool->lock, the lock is dropped and
> kthread_stop() is called.
>
> Unfortunately, this means that there's a slight chance that the target
> kworker may see WORKER_DIE and exit before kthread_stop() finishes,
> so the target task can be freed before or during kthread_stop().
>
> Fix it by pinning the target task before setting WORKER_DIE and
> putting it after kthread_stop() is done.
>
> tj: Improved patch description and comment. Moved pinning above
> WORKER_DIE to better signify what it's protecting.
>
> CC: stable@xxxxxxxxxxxxxxx

I don't think anyone has actually hit this bug. Is the stable tag needed?

(Jason's bug report drove me to review the workqueue code harder,
and I found this possible bug, but I think it is unrelated
to Jason's bug report.)

> Signed-off-by: Lai Jiangshan <laijs@xxxxxxxxxxxxxx>
> Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>
> ---
> kernel/workqueue.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 82ef9f3..193e977 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -1851,6 +1851,12 @@ static void destroy_worker(struct worker *worker)
>  	if (worker->flags & WORKER_IDLE)
>  		pool->nr_idle--;
>
> +	/*
> +	 * Once WORKER_DIE is set, the kworker may destroy itself at any
> +	 * point. Pin to ensure the task stays until we're done with it.
> +	 */
> +	get_task_struct(worker->task);
> +
>  	list_del_init(&worker->entry);
>  	worker->flags |= WORKER_DIE;
>
> @@ -1859,6 +1865,7 @@ static void destroy_worker(struct worker *worker)
>  	spin_unlock_irq(&pool->lock);
>
>  	kthread_stop(worker->task);
> +	put_task_struct(worker->task);
>  	kfree(worker);
>
>  	spin_lock_irq(&pool->lock);
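
For anyone following the ordering argument, here is a rough user-space
sketch of the same pin-before-signal idea. It is only an analogy, not
kernel code: struct fake_task, fake_get_task(), fake_put_task(),
fake_worker() and demo_destroy_worker() are hypothetical names,
pthread_join() stands in for kthread_stop(), and the atomic "die" flag
stands in for WORKER_DIE. It assumes the worker drops its own reference
when it exits.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical user-space stand-in for a refcounted task. */
struct fake_task {
	atomic_int refs;	/* reference count */
	atomic_bool die;	/* plays the role of WORKER_DIE */
	pthread_t thread;
};

/* Counts frees so the demo can be checked from the outside. */
static atomic_int tasks_freed;

static void fake_get_task(struct fake_task *t)
{
	atomic_fetch_add(&t->refs, 1);
}

static void fake_put_task(struct fake_task *t)
{
	int old = atomic_fetch_sub(&t->refs, 1);

	assert(old > 0);
	if (old == 1) {		/* last reference gone: free the object */
		atomic_fetch_add(&tasks_freed, 1);
		free(t);
	}
}

static void *fake_worker(void *arg)
{
	struct fake_task *t = arg;

	/* Spin until told to die, then drop our own reference and exit. */
	while (!atomic_load(&t->die))
		sched_yield();
	fake_put_task(t);
	return NULL;
}

/* Returns the reference count observed after the join, or -1 on error. */
static int demo_destroy_worker(void)
{
	struct fake_task *t = malloc(sizeof(*t));
	int refs_after_join;

	if (!t)
		return -1;
	atomic_init(&t->refs, 1);	/* the worker's own reference */
	atomic_init(&t->die, false);
	if (pthread_create(&t->thread, NULL, fake_worker, t) != 0) {
		free(t);
		return -1;
	}

	fake_get_task(t);		/* pin BEFORE signalling */
	atomic_store(&t->die, true);	/* worker may now exit at any point */

	/* Still safe to touch *t here: our pin keeps it alive. */
	pthread_join(t->thread, NULL);	/* stand-in for kthread_stop() */
	refs_after_join = atomic_load(&t->refs);

	fake_put_task(t);		/* unpin; this is the final put */
	return refs_after_join;
}
```

Calling demo_destroy_worker() from a main() should leave exactly one
reference (the pin) alive across the join. Without the fake_get_task()
call, the worker's put could free the object while the destroyer is
still reading t->thread, which is the analogue of the window the patch
closes.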
