Re: [PATCH 2/3] workqueue: not allow recursion run_workqueue

From: Frederic Weisbecker
Date: Thu Jan 22 2009 - 04:31:22 EST


On Thu, Jan 22, 2009 at 05:14:24PM +0800, Lai Jiangshan wrote:
> 1) lockdep will complain when recursion run_workqueue
> 2) works is not run orderly when recursion run_workqueue
>
> 3) BUG!
> We use recursion run_workqueue to hidden deadlock when
> keventd trying to flush its own queue.
>
> It's bug. When flush_workqueue()(nested in a work callback)returns,
> the workqueue is not really flushed, the sequence statement of
> this work callback will do some thing bad.
>
> So we should not allow workqueue trying to flush its own queue.
>
> Signed-off-by: Lai Jiangshan <laijs@xxxxxxxxxxxxxx>
> ---
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 2f44583..1129cde 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -48,8 +48,6 @@ struct cpu_workqueue_struct {
>
> struct workqueue_struct *wq;
> struct task_struct *thread;
> -
> - int run_depth; /* Detect run_workqueue() recursion depth */
> } ____cacheline_aligned;
>
> /*
> @@ -262,13 +260,6 @@ EXPORT_SYMBOL_GPL(queue_delayed_work_on);
> static void run_workqueue(struct cpu_workqueue_struct *cwq)
> {
> spin_lock_irq(&cwq->lock);
> - cwq->run_depth++;
> - if (cwq->run_depth > 3) {
> - /* morton gets to eat his hat */
> - printk("%s: recursion depth exceeded: %d\n",
> - __func__, cwq->run_depth);
> - dump_stack();
> - }
> while (!list_empty(&cwq->worklist)) {
> struct work_struct *work = list_entry(cwq->worklist.next,
> struct work_struct, entry);
> @@ -311,7 +302,6 @@ static void run_workqueue(struct cpu_workqueue_struct *cwq)
> spin_lock_irq(&cwq->lock);
> cwq->current_work = NULL;
> }
> - cwq->run_depth--;
> spin_unlock_irq(&cwq->lock);
> }
>
> @@ -368,29 +358,20 @@ static void insert_wq_barrier(struct cpu_workqueue_struct *cwq,
>
> static int flush_cpu_workqueue(struct cpu_workqueue_struct *cwq)
> {
> - int active;
> + int active = 0;
> + struct wq_barrier barr;
>
> - if (cwq->thread == current) {
> - /*
> - * Probably keventd trying to flush its own queue. So simply run
> - * it by hand rather than deadlocking.
> - */
> - run_workqueue(cwq);
> - active = 1;
> - } else {
> - struct wq_barrier barr;
> + BUG_ON(cwq->thread == current);

Hi Lai,

BUG_ON seems perhaps a bit too much for such case. The system
will run in an endless loop because of a mistake that will not have
necessarily a fatal end.
WARN_ON should be enough (plus the warn that lockdep will raise
too in this case).

Thanks.
Frederic.


> - active = 0;
> - spin_lock_irq(&cwq->lock);
> - if (!list_empty(&cwq->worklist) || cwq->current_work != NULL) {
> - insert_wq_barrier(cwq, &barr, &cwq->worklist);
> - active = 1;
> - }
> - spin_unlock_irq(&cwq->lock);
> -
> - if (active)
> - wait_for_completion(&barr.done);
> + spin_lock_irq(&cwq->lock);
> + if (!list_empty(&cwq->worklist) || cwq->current_work != NULL) {
> + insert_wq_barrier(cwq, &barr, &cwq->worklist);
> + active = 1;
> }
> + spin_unlock_irq(&cwq->lock);
> +
> + if (active)
> + wait_for_completion(&barr.done);
>
> return active;
> }
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/