Re: 3.4.4-rt13: btrfs + xfstests 006 = BOOM.. and a bonus rt_mutex deadlock report for absolutely free!

From: Chris Mason
Date: Fri Jul 13 2012 - 08:55:04 EST


On Wed, Jul 11, 2012 at 11:47:40PM -0600, Mike Galbraith wrote:
> Greetings,

[ deadlocks with btrfs and the recent RT kernels ]

I talked with Thomas about this and I think the problem is the
single-reader nature of the RT rwlocks. The lockdep report below
mentions that btrfs is calling:

> [ 692.963099] [<ffffffff811fabd2>] btrfs_clear_path_blocking+0x32/0x70

In this case, the task has a number of blocking read locks on the btrfs buffers,
and we're trying to turn them back into spinning read locks. Even
though btrfs is taking the read rwlock, it doesn't think of this as a new
lock operation because we were blocking out new writers.

If the second task has taken the spinning read lock, it is going to
prevent that clear_path_blocking operation from progressing, even though
it would have worked on a non-RT kernel.

The solution should be to make the blocking read locks in btrfs honor the
single-reader semantics. This means not allowing more than one blocking
reader and not allowing a spinning reader when there is a blocking
reader. Strictly speaking btrfs shouldn't need recursive readers on a
single lock, so I wouldn't worry about that part.
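
As a rough illustration of that rule, here is a toy, single-threaded model
of a lock that allows only one reader at a time, whether spinning or
blocking. It is only a sketch: every name in it (toy_lock,
toy_try_read_lock and so on) is made up for this example, it is not the
btrfs extent buffer locking code, and it does no real synchronization.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy, single-threaded model of single-reader semantics.
 * All names are hypothetical; this is not btrfs code and it
 * performs no real synchronization.
 */
enum toy_state { TOY_UNLOCKED, TOY_SPIN_READ, TOY_BLOCKING_READ };

struct toy_lock {
	enum toy_state state;
};

/* Take a spinning read lock; fails if any reader already holds the lock. */
static bool toy_try_read_lock(struct toy_lock *l)
{
	if (l->state != TOY_UNLOCKED)
		return false;
	l->state = TOY_SPIN_READ;
	return true;
}

/* Convert the caller's spinning read lock into a blocking one. */
static void toy_set_read_blocking(struct toy_lock *l)
{
	assert(l->state == TOY_SPIN_READ);
	l->state = TOY_BLOCKING_READ;
}

/*
 * Convert back to spinning. No second reader could have taken the
 * lock while we held it in blocking mode, so nothing can hold this up.
 */
static void toy_clear_read_blocking(struct toy_lock *l)
{
	assert(l->state == TOY_BLOCKING_READ);
	l->state = TOY_SPIN_READ;
}

/* Drop the read lock from either mode. */
static void toy_read_unlock(struct toy_lock *l)
{
	assert(l->state != TOY_UNLOCKED);
	l->state = TOY_UNLOCKED;
}
```

The point of the model is that a second toy_try_read_lock() fails while
the first reader is in blocking mode, so the conversion back to spinning
never has to wait on another reader.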

There is also a chunk of code in btrfs_clear_path_blocking that makes
sure to strictly honor top down locking order during the conversion. It
only does this when lockdep is enabled because in non-RT kernels we
don't need to worry about it. For RT we'll want to enable that as well.
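
To make the ordering concrete, here is a small sketch of a top-down walk
over the levels of a path. It is not the code in btrfs_clear_path_blocking;
toy_path, TOY_MAX_LEVEL and the function name are invented for the example,
and it assumes that a higher level index is closer to the root and that
level 0 is the leaf.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_LEVEL 8	/* assumed depth limit for this sketch */

/* Hypothetical stand-in for a path: one flag per tree level. */
struct toy_path {
	int held[TOY_MAX_LEVEL];	/* nonzero if a blocking lock is held */
};

/*
 * Walk from the highest level (assumed closest to the root) down to
 * level 0 and record the order in which held levels are visited.
 * Returns the number of levels recorded in order[].
 */
static size_t toy_clear_blocking_top_down(const struct toy_path *p,
					  int order[TOY_MAX_LEVEL])
{
	size_t n = 0;

	assert(p != NULL);
	for (int level = TOY_MAX_LEVEL - 1; level >= 0; level--) {
		if (!p->held[level])
			continue;
		/* A real implementation would convert the lock here. */
		order[n++] = level;
	}
	return n;
}
```

With locks held at levels 0, 1 and 3, the walk visits 3, then 1, then 0,
which matches the order in which the locks would have been taken going
down the tree.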

I'll give this a shot later today.

-chris
