Re: [PATCH 2/9] vfs: export do_splice_direct() to modules
From: Jan Kara
Date: Tue Mar 19 2013 - 16:25:50 EST
On Mon 18-03-13 21:53:34, Al Viro wrote:
> On Mon, Mar 18, 2013 at 04:39:36PM +0100, Jan Kara wrote:
> > IMO the deadlock is real. In freeze_super() we wait for all writers to
> > the filesystem to finish while blocking beginning of any further writes. So
> > we have a deadlock scenario like:
> > THREAD1 THREAD2 THREAD3
> > mnt_want_write() mutex_lock(&inode->i_mutex);
> > ... freeze_super()
> > block on mutex_lock(&inode->i_mutex)
> > sb_wait_write(sb, SB_FREEZE_WRITE);
> > block in sb_start_write()
> The bug is on fsfreeze side and this is not the only problem related to it.
> I've missed the implications when I applied "fs: Add freezing handling
> to mnt_want_write() / mnt_drop_write()" last June ;-/
> The thing is, until then mnt_want_write() used to be a counter; it could be
> nested. Now any such nesting is a deadlock you've just described. This
> is seriously wrong, IMO.
Well, but sb_start_write() has to be blocking (blocks when fs is frozen)
and you have to get it somewhere. It seems only natural to get the counter
from original mnt_want_write() at the same place and use one function for
that. Whether I should have changed the name from mnt_want_write() to
something else is questionable...
> BTW, having sb_start_write() buried in individual ->splice_write() is
> asking for trouble; could you describe the rules for that? E.g. where
> does it nest wrt filesystem-private locks? XFS iolock, for example...
Generally, the freeze protection should be the outermost lock taken (so
that we mitigate possibility of blocking readers when waiting for fs to
unfreeze). So it ranks above i_mutex, or XFS' ilock and iolock.
It seems that I screwed this up for ->splice_write() :-| If we are going to
move out sb_start_write() out of filesystems' hands into do_splice_from()
then we should likely do the same with ->aio_write(). Hmm?
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/