Re: Deadlocks due to per-process plugging

From: Mike Galbraith
Date: Wed Jul 18 2012 - 00:44:59 EST


(adds rather important missing Cc)

On Tue, 2012-07-17 at 15:10 +0200, Mike Galbraith wrote:
> On Mon, 2012-07-16 at 12:19 +0200, Thomas Gleixner wrote:
>
> > > @@ -647,8 +648,11 @@ static inline void rt_spin_lock_fastlock
> > >
> > > if (likely(rt_mutex_cmpxchg(lock, NULL, current)))
> > > rt_mutex_deadlock_account_lock(lock, current);
> > > - else
> > > + else {
> > > + if (blk_needs_flush_plug(current))
> > > + blk_schedule_flush_plug(current);
> > > slowfn(lock);
> > > + }
> >
> > That should do the trick.
>
> Box has been grinding away long enough now to agree that it did.
>
> rt: pull your plug before blocking

Hm. x3550 seems to have lost interest in nearly instant gratification
ext4 deadlock testcase: taskset -c 3 dbench -t 30 -s 8 in enterprise.
Previously, it _might_ have survived one 30 second test, but never for
minutes, much less several minutes of very many threads, so it appears
to have been another flavor of IO dependency deadlock.

I just tried virgin 3.4.4-rt13, and it too happily churned away.. until
I tried dbench -t 300 -s 500, that is. That (seemingly 100% repeatably)
triggers an RCU stall that doesn't get to serial console, nor will my virgin
source/config setup crash dump. Damn. Enterprise kernel will dump, but
won't stall, so I guess I'd better check out the other virgin 3.x-rt
trees to at least narrow down where stall started.

Whatever, RCU stall is a different problem. Revert unplug patchlet, and
ext4 deadlock is back in virgin 3.4-rt, so methinks it's sufficiently
verified that either we need some form of unplug before blocking, or we
need a pull-your-plug point in at least two filesystems, maybe more.

-Mike

The patch in question for missing Cc. Maybe should be only mutex, but I
see no reason why IO dependency can only possibly exist for mutexes...

rt: pull your plug before blocking

Queued IO can lead to IO deadlock should a task require wakeup from a task
which is blocked on that queued IO.

ext3: dbench1 queues a buffer, blocks on journal mutex, its plug is not
pulled. dbench2 mutex owner is waiting for kjournald, who is waiting for
the buffer queued by dbench1. Game over.

Signed-off-by: Mike Galbraith <efault@xxxxxx>

diff --git a/kernel/rtmutex.c b/kernel/rtmutex.c
index 3bff726..3f6ae32 100644
--- a/kernel/rtmutex.c
+++ b/kernel/rtmutex.c
@@ -20,6 +20,7 @@
#include <linux/export.h>
#include <linux/sched.h>
#include <linux/timer.h>
+#include <linux/blkdev.h>

#include "rtmutex_common.h"

@@ -647,8 +648,11 @@ static inline void rt_spin_lock_fastlock(struct rt_mutex *lock,

 	if (likely(rt_mutex_cmpxchg(lock, NULL, current)))
 		rt_mutex_deadlock_account_lock(lock, current);
-	else
+	else {
+		if (blk_needs_flush_plug(current))
+			blk_schedule_flush_plug(current);
 		slowfn(lock);
+	}
 }

static inline void rt_spin_lock_fastunlock(struct rt_mutex *lock,
@@ -1104,8 +1108,11 @@ rt_mutex_fastlock(struct rt_mutex *lock, int state,
 	if (!detect_deadlock && likely(rt_mutex_cmpxchg(lock, NULL, current))) {
 		rt_mutex_deadlock_account_lock(lock, current);
 		return 0;
-	} else
+	} else {
+		if (blk_needs_flush_plug(current))
+			blk_schedule_flush_plug(current);
 		return slowfn(lock, state, NULL, detect_deadlock);
+	}
 }

static inline int
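
For anyone who wants the idea without the kernel context, here is a rough
user-space sketch of why pulling the plug before blocking matters. It is a
toy model only, not kernel code: every toy_* name, the two structs, and the
pull_plug flag are hypothetical and exist purely for illustration. The only
real kernel calls involved are the blk_needs_flush_plug() and
blk_schedule_flush_plug() used in the patch, which the toy helpers loosely
imitate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_REQS 16

/* Hypothetical stand-in for a task's private plug list (not kernel code). */
struct toy_task {
	int plugged[TOY_MAX_REQS];
	size_t nr_plugged;
};

/* Hypothetical stand-in for the device queue that other tasks can wait on. */
struct toy_queue {
	int submitted[TOY_MAX_REQS];
	size_t nr_submitted;
};

/* Queue a request behind the plug: nobody else can see it complete yet. */
static void toy_queue_io(struct toy_task *t, int req)
{
	assert(t->nr_plugged < TOY_MAX_REQS);
	t->plugged[t->nr_plugged++] = req;
}

/* Toy analogue of the blk_needs_flush_plug() check used in the patch. */
static bool toy_needs_flush_plug(const struct toy_task *t)
{
	return t->nr_plugged > 0;
}

/* Toy analogue of flushing: hand every plugged request to the device queue. */
static void toy_flush_plug(struct toy_task *t, struct toy_queue *q)
{
	for (size_t i = 0; i < t->nr_plugged; i++) {
		assert(q->nr_submitted < TOY_MAX_REQS);
		q->submitted[q->nr_submitted++] = t->plugged[i];
	}
	t->nr_plugged = 0;
}

/*
 * Model the contended lock path. Returns how many requests are still
 * hidden in the plug at the moment the task goes to sleep; anything
 * above zero is IO that a waiter such as kjournald can never see finish.
 */
static size_t toy_block_on_lock(struct toy_task *t, struct toy_queue *q,
				bool pull_plug)
{
	if (pull_plug && toy_needs_flush_plug(t))
		toy_flush_plug(t, q);
	return t->nr_plugged;
}
```

In this model, calling toy_block_on_lock() with pull_plug false after
toy_queue_io() leaves the request stranded in the plug, which corresponds to
dbench1 sleeping on the journal mutex while kjournald waits for its buffer.
With pull_plug true the request reaches the queue before the task sleeps, so
the dependency chain can make progress.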


