On Mon, Jul 14, 2008 at 05:34:51PM +1000, Lachlan McIlroy wrote:Why do we need to try-lock the inode? Will we have an ABBA deadlockMikael Abrahamsson wrote:On Fri, 11 Jul 2008, Dave Chinner wrote:xfs_ilock(ip, XFS_IOLOCK_SHARED);
That aside, what was the assert failure reported prior to the oops? i.e. paste the lines in the log before the ---[ cut here ]--- line? One of them will start with 'Assertion failed:', I think....These ones?
Jul 8 04:44:56 via kernel: [554197.888008] Assertion failed: whichfork == XFS_ATTR_FORK || ip->i_delayed_blks == 0, file: fs/xfs/xfs_bmap.c, line: 5879
Jul 9 03:25:21 via kernel: [42940.748007] Assertion failed: whichfork == XFS_ATTR_FORK || ip->i_delayed_blks == 0, file: fs/xfs/xfs_bmap.c, line: 5879
if (whichfork == XFS_DATA_FORK &&
(ip->i_delayed_blks || ip->i_size > ip->i_d.di_size)) {
/* xfs_fsize_t last_byte = xfs_file_last_byte(ip); */
error = xfs_flush_pages(ip, (xfs_off_t)0,
-1, 0, FI_REMAPF);
if (error) {
xfs_iunlock(ip, XFS_IOLOCK_SHARED);
return error;
}
}
ASSERT(whichfork == XFS_ATTR_FORK || ip->i_delayed_blks == 0);
This is a race between xfs_fsr and a mmap write. xfs_fsr acquires the
iolock and then flushes the file and because it has the iolock it doesn't
expect any new delayed allocations to occur. A mmap write can allocate
delayed allocations without acquiring the iolock so is able to get in
after the flush but before the ASSERT.
Christoph and I were contemplating this problem with ->page_mkwrite
reecently. The problem is that we can't, right now, return an
EAGAIN-like error to ->page_mkwrite() and have it retry the
page fault. Other parts of the page faulting code can do this,
so it seems like a solvable problem.
The basic concept is that if we can return a EAGAIN result we can
try-lock the inode and hold the locks necessary to avoid this race
or prevent the page fault from dirtying the page until the
filesystem is unfrozen.