Re: [PATCH] core: workqueue: BUG_ON on workqueue recursion
From: Oleg Nesterov
Date: Wed Feb 03 2010 - 14:47:31 EST
On 02/03, Simon Kagstrom wrote:
>
> When the workqueue is flushed from workqueue context (recursively), the
> system enters a strange state where things at random (dependent on the
> global workqueue) start misbehaving. For example, for us the console and
> logins lock up while the web server continues running.
>
> Since the system becomes unstable, change this to a BUG_ON instead.
I agree with this patch. We are going to deadlock anyway: if the
condition is true, the caller is cwq->current_work, which means
flush_cpu_workqueue() will insert the barrier and then hang waiting
for it.
However,
> @@ -482,7 +482,7 @@ static int flush_cpu_workqueue(struct cpu_workqueue_struct *cwq)
>  	int active = 0;
>  	struct wq_barrier barr;
>
> -	WARN_ON(cwq->thread == current);
> +	BUG_ON(cwq->thread == current);
Another option is to change the code to do

	if (WARN_ON(cwq->thread == current))
		return;

This gives the kernel a chance to survive after the warning.
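
To make the idea concrete, here is a minimal user-space sketch, not the
kernel code. Everything in it (struct task, struct cpu_workqueue,
current_task, warn_on(), flush_cpu_workqueue_model()) is a hypothetical
stand-in invented for illustration. Since flush_cpu_workqueue() returns
int, the sketch returns a value on the early exit; which value the real
function should return is left open here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins; none of these are the kernel's definitions. */
struct task {
	const char *name;
};

struct cpu_workqueue {
	const struct task *thread;	/* worker that runs the queued work */
	int pending;			/* number of queued work items */
};

/* Stand-in for the kernel's notion of the currently running task. */
static const struct task *current_task;

/* Mimics WARN_ON(): report the condition and hand back its truth value. */
static int warn_on(int cond, const char *expr)
{
	if (cond)
		fprintf(stderr, "WARNING: %s\n", expr);
	return cond;
}

/*
 * Returns 1 if the queue was drained, 0 if the flush was refused
 * because the caller is the worker itself.
 */
int flush_cpu_workqueue_model(struct cpu_workqueue *cwq)
{
	assert(cwq != NULL);

	/* A worker waiting on its own barrier can never make progress. */
	if (warn_on(cwq->thread == current_task, "cwq->thread == current"))
		return 0;

	/* Any other task may wait; model the worker draining the queue. */
	cwq->pending = 0;
	return 1;
}
```

With current_task set to the worker, the call warns and returns 0
without touching the queue; with any other task it drains the queue
and returns 1.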
What do you think?
Oleg.