Problem with freezable workqueues

From: Rafael J. Wysocki
Date: Tue Feb 27 2007 - 16:49:56 EST


Hi,

We have a problem with freezable workqueues in 2.6.21-rc1 and in -mm
(there are only two of them, in XFS, but still). Namely, their worker threads
deadlock with workqueue_cpu_callback() that gets called during the CPU hotplug,
becuase workqueue_cpu_callback() tries to stop these threads while they are
frozen (disable_nonboot_cpus() happens after we've frozen tasks).

For 2.6.21-rc1 I've invented the appended workaround (works for me, waiting for
Johannes to confirm it works for him too), but I think we need something better
for -mm and future kernels.

Greetings,
Rafael


kernel/workqueue.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)

Index: linux-2.6.21-rc1/kernel/workqueue.c
===================================================================
--- linux-2.6.21-rc1.orig/kernel/workqueue.c 2007-02-24 10:17:57.000000000 +0100
+++ linux-2.6.21-rc1/kernel/workqueue.c 2007-02-24 20:00:22.000000000 +0100
@@ -376,8 +376,19 @@ static int worker_thread(void *__cwq)

set_current_state(TASK_INTERRUPTIBLE);
while (!kthread_should_stop()) {
- if (cwq->freezeable)
- try_to_freeze();
+ if (try_to_freeze()) {
+ /* We've just left the refrigerator. If our CPU is
+ * a nonboot one, we might have been replaced.
+ * The lock is taken to prevent the race with
+ * cleanup_workqueue_thread() from happening
+ */
+ spin_lock_irq(&cwq->lock);
+ if (cwq->thread != current) {
+ spin_unlock_irq(&cwq->lock);
+ break;
+ }
+ spin_unlock_irq(&cwq->lock);
+ }

add_wait_queue(&cwq->more_work, &wait);
if (list_empty(&cwq->worklist))
@@ -539,6 +550,9 @@ static void cleanup_workqueue_thread(str
cwq = per_cpu_ptr(wq->cpu_wq, cpu);
spin_lock_irqsave(&cwq->lock, flags);
p = cwq->thread;
+ if (p && frozen(p))
+ p = NULL;
+
cwq->thread = NULL;
spin_unlock_irqrestore(&cwq->lock, flags);
if (p)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/