Re: [BUG 3.13.0-rc6] reiserfs possible circular locking dependency

From: Jeff Mahoney
Date: Fri Jan 03 2014 - 17:04:57 EST


On 1/3/14, 2:46 PM, Linus Torvalds wrote:
> On Fri, Jan 3, 2014 at 11:16 AM, Knut Petersen
> <Knut_Petersen@xxxxxxxxxxx> wrote:
>> Rebooting after a power failure on an openSuSE 13.1 system
>> with kernel 3.13.0-rc6 triggered the attached lockdep warning.
>
> Hmm. It seems to be that the *normal* sequence should be:
>
> - get i_mutex, call lookup, which gets sbi->lock (reiserfs_write_lock)
>
> but in the mounting path, we have special circumstances.
>
> That finish_unfinished() function does
>
> - reiserfs_write_lock_nested() .
> - remove_save_link
> - iput(inode) with the write lock held
>
> and that can apparently end up taking i_mutex in open_xa_dir (and then
> recursively the write lock, but that's an explicitly recursive lock,
> so that part should be ok).
>
> Now, I don't think this can *really* deadlock with the normal order of
> operations, because during mounting there is no other process that can
> take those in the reverse order (since the filesystem isn't live), but
> I do wonder if we should just release the reiserfs write lock over the
> iputs. We release it in other parts anyway (like for the quota off)
>
> Jeff, you already touched this exact case in commit d2d0395fd177
> ("reiserfs: locking, release lock around quota operations") except
> that was for those quota operation cases.
>
> Even if it's not a real problem, making lockdep happy sounds like a
> good idea. Of course, the trouble is that this code path almost never
> gets exercised (which is why this hasn't been noticed earlier), so
> testing...
>
> Jeff? Comments?

If someone ever invents a time machine, I'd go back to 2004 and tell
myself to fight harder to make a reiserfs v3.7 with real extended
attribute items. This code will haunt me to my death.

Anyway, yeah. The right thing here is to drop the write lock around the
iput. Dropping it across more than just the iput would be ok too.
finish_unfinished() runs whenever the file system goes read-write, and
that includes the remount path, so there can be other users of the file
system at that point. Even so, it would be a recursive acquire, so we
wouldn't actually deadlock there.
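
To illustrate the pattern, here is a minimal userspace sketch of
"drop the recursive lock around the iput, then restore the saved depth".
It is not the kernel patch. Every name in it (struct write_lock,
write_unlock_nested(), write_lock_nested(), fake_iput(), finish_one())
is a hypothetical pthread stand-in, loosely modelled on the
reiserfs_write_lock_nested() call mentioned in the quoted text.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the per-superblock write lock: a lock its
 * owner may re-acquire, tracked by a depth counter (0 means unowned). */
struct write_lock {
    pthread_mutex_t state;     /* protects owner and depth */
    pthread_cond_t released;   /* signalled when depth drops to 0 */
    pthread_t owner;
    int depth;
};

static struct write_lock sb_lock = {
    .state = PTHREAD_MUTEX_INITIALIZER,
    .released = PTHREAD_COND_INITIALIZER,
};
static pthread_mutex_t i_mutex = PTHREAD_MUTEX_INITIALIZER;

static void write_lock_acquire(struct write_lock *l)
{
    pthread_mutex_lock(&l->state);
    if (l->depth > 0 && pthread_equal(l->owner, pthread_self())) {
        l->depth++;                      /* recursive acquire by the owner */
    } else {
        while (l->depth > 0)
            pthread_cond_wait(&l->released, &l->state);
        l->owner = pthread_self();
        l->depth = 1;
    }
    pthread_mutex_unlock(&l->state);
}

static void write_lock_release(struct write_lock *l)
{
    pthread_mutex_lock(&l->state);
    if (--l->depth == 0)
        pthread_cond_signal(&l->released);
    pthread_mutex_unlock(&l->state);
}

/* Drop the lock completely, whatever the nesting, and return the depth
 * so the caller can restore it afterwards. */
static int write_unlock_nested(struct write_lock *l)
{
    pthread_mutex_lock(&l->state);
    int depth = l->depth;
    l->depth = 0;
    pthread_cond_signal(&l->released);
    pthread_mutex_unlock(&l->state);
    return depth;
}

/* Re-take the lock and restore the nesting depth saved above. */
static void write_lock_nested(struct write_lock *l, int depth)
{
    pthread_mutex_lock(&l->state);
    while (l->depth > 0)
        pthread_cond_wait(&l->released, &l->state);
    l->owner = pthread_self();
    l->depth = depth;
    pthread_mutex_unlock(&l->state);
}

/* Stand-in for iput(): uses the normal order, i_mutex first and then
 * the write lock. Returns the lock depth it observed while holding it. */
static int fake_iput(void)
{
    pthread_mutex_lock(&i_mutex);
    write_lock_acquire(&sb_lock);
    int seen = sb_lock.depth;
    write_lock_release(&sb_lock);
    pthread_mutex_unlock(&i_mutex);
    return seen;
}

/* Stand-in for one iteration of the save-link processing loop. */
static int finish_one(void)
{
    write_lock_acquire(&sb_lock);                /* held on entry */
    int depth = write_unlock_nested(&sb_lock);   /* drop before the iput */
    int seen = fake_iput();                      /* i_mutex, then write lock */
    write_lock_nested(&sb_lock, depth);          /* restore saved depth */
    int restored = sb_lock.depth;
    write_lock_release(&sb_lock);
    return (depth == 1 && seen == 1 && restored == 1) ? 0 : -1;
}
```

Calling finish_one() from a test program returns 0 when fake_iput()
saw a fresh, non-nested acquire. That is the point of the sketch:
i_mutex is never taken while the write lock is already held, so the
order matches the normal lookup path.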

I'll work something up over the weekend or on Monday.

-Jeff

--
Jeff Mahoney
SUSE Labs
