Re: 2.6.18.4: flush_workqueue calls mutex_lock in interruptenvironment

From: Trond Myklebust
Date: Wed Dec 13 2006 - 07:41:59 EST


On Wed, 2006-12-13 at 09:02 +0100, Arjan van de Ven wrote:
> On Wed, 2006-12-13 at 08:25 +0100, xb wrote:
> > Hi all,
> >
> > Running some IO stress tests on a 8*ways IA64 platform, we got:
> > BUG: warning at kernel/mutex.c:132/__mutex_lock_common() message
> > followed by:
> > Unable to handle kernel paging request at virtual address
> > 0000000000200200
> > oops corresponding to anon_vma_unlink() calling list_del() on a
> > poisonned list.
> >
> > Having a look to the stack, we see that flush_workqueue() calls
> > mutex_lock() with softirqs disabled.
>
> something is wrong here... flush_workqueue() is a sleeping function and
> is not allowed to be called in such a context!

It seems utterly insane to have aio_complete() flush a workqueue. That
function has to be called from a number of different environments,
including non-sleep tolerant environments.

For instance it means that directIO on NFS will now cause the rpciod
workqueues to call flush_workqueue(aio_wq), thus slowing down all RPC
activity.

Trond

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/