[PATCH] workqueue: lock cwq access in drain_workqueue

From: Thomas Tuttle
Date: Fri Sep 09 2011 - 11:22:29 EST


Take cwq->gcwq->lock while drain_workqueue checks whether a cwq is
empty. Without the lock, the check can race with cwq_dec_nr_in_flight,
which decrements nr_active and then increments it again when it
activates a delayed work, so drain_workqueue can see a cwq as empty
while work is still pending on it.

We discovered this when a corner case in one of our drivers led us to
destroy a workqueue whose remaining work always requeued itself on the
same workqueue. We would hit this race condition and trip the BUG_ON
at workqueue.c:3080.

Patch is against HEAD as of Fri Sep 9 15:16:09 UTC 2011
(e4e436e0bd480668834fe6849a52c5397b7be4fb).

Signed-off-by: Thomas Tuttle <ttuttle@xxxxxxxxxxxx>
---
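The race described in the changelog can be sketched in userspace. This
is only a rough model, not kernel code: struct model_cwq, its
nr_delayed counter (standing in for the delayed_works list), and both
helper functions are hypothetical names, and a pthread mutex plays the
role of cwq->gcwq->lock. In this model the completion path takes the
lock itself, whereas in the kernel the caller already holds it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for cpu_workqueue_struct. */
struct model_cwq {
	pthread_mutex_t lock;	/* plays the role of cwq->gcwq->lock */
	int nr_active;		/* works currently counted as active */
	int nr_delayed;		/* stands in for the delayed_works list */
};

/*
 * Models the completion path: retire one active work and, if a delayed
 * work is waiting, activate it. nr_active drops and rises again, and
 * the delayed work leaves its list in between, all under the lock.
 */
static void model_dec_nr_in_flight(struct model_cwq *cwq)
{
	pthread_mutex_lock(&cwq->lock);
	assert(cwq->nr_active > 0);
	cwq->nr_active--;
	if (cwq->nr_delayed > 0) {
		cwq->nr_delayed--;	/* work leaves the delayed list */
		/* An unlocked reader could see "0 active, 0 delayed" here. */
		cwq->nr_active++;	/* and is counted as active */
	}
	pthread_mutex_unlock(&cwq->lock);
}

/* Models the patched check: sample both fields under the lock. */
static bool model_cwq_flushed(struct model_cwq *cwq)
{
	bool flushed;

	pthread_mutex_lock(&cwq->lock);
	flushed = !cwq->nr_active && cwq->nr_delayed == 0;
	pthread_mutex_unlock(&cwq->lock);
	return flushed;
}
```

Because both fields are read inside the same critical section that the
completion path uses, the checker in this model only ever sees the
state before or after an activation, never the window in the middle.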
kernel/workqueue.c | 8 +++++++-
1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 25fb1b0..d610ced 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2412,8 +2412,14 @@ reflush:

 	for_each_cwq_cpu(cpu, wq) {
 		struct cpu_workqueue_struct *cwq = get_cwq(cpu, wq);
+		int cwq_flushed;

-		if (!cwq->nr_active && list_empty(&cwq->delayed_works))
+		spin_lock_irq(&cwq->gcwq->lock);
+		cwq_flushed = !cwq->nr_active
+			&& list_empty(&cwq->delayed_works);
+		spin_unlock_irq(&cwq->gcwq->lock);
+
+		if (cwq_flushed)
 			continue;

 		if (++flush_cnt == 10 ||
--
1.7.3.1
