[PATCH 2/2 V4] workqueue: fix possible race condition between rescuer and pwq-release

From: Lai Jiangshan
Date: Fri Apr 18 2014 - 09:22:45 EST


There is a race condition between rescuer_thread() and
pwq_unbound_release_workfn().

The works of the @pwq may be processed by other workers, and the
@pwq may be scheduled for release (because its wq's attrs were changed)
before the rescuer starts processing it. In this case
pwq_unbound_release_workfn() will corrupt the wq->maydays list,
and rescuer_thread() will access corrupted data.

Use get_pwq() to pin the pwq until the rescuer is done with it.

Signed-off-by: Lai Jiangshan <laijs@xxxxxxxxxxxxxx>
---
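For readers who want the pattern in isolation, here is a minimal
userspace sketch of "pin while queued, unpin when the consumer is done".
It is not kernel code: struct fake_pwq, the fake_*() helpers and
race_scenario() are hypothetical names invented for illustration, a plain
int stands in for the pwq refcount, and a bool stands in for membership
on wq->maydays. Locking and the real release path are deliberately left
out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct pool_workqueue (illustration only). */
struct fake_pwq {
	int refcnt;		/* models the pwq reference count */
	bool on_maydays;	/* models !list_empty(&pwq->mayday_node) */
	bool released;		/* set where the release would be scheduled */
};

static void fake_get_pwq(struct fake_pwq *pwq)
{
	assert(pwq->refcnt > 0);	/* may only pin a live object */
	pwq->refcnt++;
}

static void fake_put_pwq(struct fake_pwq *pwq)
{
	if (--pwq->refcnt == 0)
		pwq->released = true;
}

/* Producer side: optionally take a reference before queueing. */
static void fake_send_mayday(struct fake_pwq *pwq, bool pin)
{
	if (!pwq->on_maydays) {
		if (pin)
			fake_get_pwq(pwq);
		pwq->on_maydays = true;
	}
}

/* Consumer side: dequeue, do the work, then drop the pin. */
static void fake_rescue(struct fake_pwq *pwq)
{
	pwq->on_maydays = false;
	/* ... the rescuer would process the scheduled works here ... */
	fake_put_pwq(pwq);
}

/*
 * Returns true if the object was released while still queued,
 * which is the situation this patch is meant to prevent.
 */
static bool race_scenario(bool pin)
{
	struct fake_pwq pwq = { .refcnt = 1 };	/* one pre-existing reference */
	bool released_while_queued;

	fake_send_mayday(&pwq, pin);
	fake_put_pwq(&pwq);	/* other users finish before the rescuer runs */
	released_while_queued = pwq.released && pwq.on_maydays;

	if (!released_while_queued)
		fake_rescue(&pwq);
	return released_while_queued;
}
```

Under these assumptions, calling race_scenario(false) from a small main()
models the unpinned case, where the last reference is dropped while the
object is still queued. race_scenario(true) models the pinned case, where
the release is deferred until the consumer drops its reference.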
kernel/workqueue.c | 13 +++++++++++++
1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 7539244..8c0830c 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1916,6 +1916,16 @@ static void send_mayday(struct work_struct *work)

 	/* mayday mayday mayday */
 	if (list_empty(&pwq->mayday_node)) {
+		/*
+		 * A pwq might go away at any time; pin it until the
+		 * rescuer is done with it.
+		 *
+		 * In particular, a pwq of an unbound wq may be released
+		 * before the wq's destruction when the wq's attrs change.
+		 * In this case, pwq_unbound_release_workfn() may execute
+		 * before rescuer_thread() and corrupt wq->maydays.
+		 */
+		get_pwq(pwq);
 		list_add_tail(&pwq->mayday_node, &wq->maydays);
 		wake_up_process(wq->rescuer->task);
 	}
@@ -2447,6 +2457,9 @@ repeat:

 		process_scheduled_works(rescuer);

+		/* put the reference grabbed by send_mayday(). */
+		put_pwq(pwq);
+
 		/*
 		 * Leave this pool. If keep_working() is %true, notify a
 		 * regular worker; otherwise, we end up with 0 concurrency
--
1.7.4.4
