Race between generic_forget_inode() and sync_sb_inodes()?

From: David Chinner
Date: Thu Nov 29 2007 - 16:57:00 EST



If we are in the process of dropping an inode and it is hashed,
generic_forget_inode() will mark it I_WILL_FREE and drop the
inode_lock before calling write_inode_now().

However, at this point, the inode is still on the sb->s_dirty_list
so sync_sb_inodes() could see it and try to write it back.
i.e.:

generic_forget_inode sync_sb_inodes

i_state |= I_WILL_FREE
spin_unlock(inode_lock)
write_inode_now()
spin_lock(inode_lock)
__iget(inode)
__writeback_single_inode()
spin_unlock(inode_lock)
spin_lock(inode_lock)
i_state &= ~I_WILL_FREE
(remove from lists)
i_state |= I_FREEING
spin_unlock(inode_lock)
i_state = I_CLEAR
spin_lock(inode_lock)
iput(inode)
BUG_ON(i_state == I_CLEAR)
(inode gets freed)


I came across this because I've been making changes to XFS to avoid the
inode hash, and I've found that I need to remove the inode from the
dirty list when setting I_WILL_FREE to avoid this race. I can't see
how this race is avoided when inodes are hashed, so I'm wondering
if we've just been lucky or there's something that I'm missing that
means the above does not occur.

If it really is a race, then sync_sb_inodes() should really check
I_WILL_FREE before doing an __iget(), I think. If I_WILL_FREE is set
we are already doing writeback so we don't need to do it from
sync_sb_inodes(), right?

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/