Re: [patch] stop inotify from sending random DELETE_SELF event under load

From: Mark Fasheh
Date: Wed Sep 21 2005 - 13:10:46 EST


On Wed, Sep 21, 2005 at 07:45:38AM -0700, Joel Becker wrote:
> On Wed, Sep 21, 2005 at 10:17:38AM +0100, Christoph Hellwig wrote:
> > The real fix would be to put an equivalent of OCFS2_INODE_MAYBE_ORPHANED
> > into struct inode. That way it could be shared by other clustered
> > filesystems as well, and OCFS2 would have no need to implement ->drop_inode.
> Or change the VFS patterns to allow lookup and validation of the
> inode before choosing the generic_drop/generic_delete path.
Right. *Only* setting something akin to OCFS2_INODE_MAYBE_ORPHANED on struct
inode would give us essentially what we have today, assuming that
generic_drop_inode() punted to delete_inode() in that case (as well as when
nlink == 0). That's fine as long as we're ok with inodes which might *not*
actually be orphaned going through delete_inode().
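
To make that concrete, here is a minimal sketch, assuming a hypothetical
I_MAYBE_ORPHANED bit in inode->i_state (no such flag exists today) that a
node would set when it sees the inode become potentially orphaned:

/*
 * Hypothetical sketch only -- I_MAYBE_ORPHANED is an invented i_state
 * flag.  The point is that generic_drop_inode() would punt to the
 * delete path either when the link count hits zero or when the inode
 * has been flagged as possibly orphaned.
 */
static void generic_drop_inode_sketch(struct inode *inode)
{
	if (!inode->i_nlink || (inode->i_state & I_MAYBE_ORPHANED))
		generic_delete_inode(inode);	/* go through delete_inode() */
	else
		generic_forget_inode(inode);	/* just evict from the caches */
}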

A more involved solution might be to give OCFS2 the chance to do its
cluster querying at or around drop_inode() time on those inodes which have
been marked as potentially orphaned (rough sketch after the next paragraph).

This would give us the benefit of allowing those inodes which wound up not
being orphaned (due to remote node error or death) to go through
generic_forget_inode(), which, if my reading is correct, would at least
allow them to live longer in the system. I'm not sure how important this is,
however, considering that the number of inodes which wind up not having been
orphaned is typically very small, and as far as I can tell the only downside
for those few is a small loss of performance.
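
In code, the hook I'm imagining would look something like this -- note
that ocfs2_query_orphaned() is an invented stand-in for the actual
cluster query, and this assumes the generic_* helpers are visible from
the filesystem:

/*
 * Rough sketch, not real code: ask the cluster whether a
 * possibly-orphaned inode really is orphaned before choosing the
 * delete or forget path.
 */
static void ocfs2_drop_inode_sketch(struct inode *inode)
{
	struct ocfs2_inode_info *oi = OCFS2_I(inode);

	if ((oi->ip_flags & OCFS2_INODE_MAYBE_ORPHANED) &&
	    ocfs2_query_orphaned(inode))
		generic_delete_inode(inode);	/* really orphaned: wipe it */
	else
		generic_forget_inode(inode);	/* not orphaned after all */
}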

Btw, the other reason a local node might not completely wipe an inode in
OCFS2 land is that another node is still using it (it has open file
descriptors). In that case what we do today is sync the inode, truncate its
pages, and then exit ocfs2_delete_inode(), thus allowing our local VFS to
destroy it, secure in the knowledge that the actual wiping of the inode will
happen elsewhere in the cluster.
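
That early-exit path looks roughly like the following; the helper name
ocfs2_inode_in_use_elsewhere() is invented here, the real check goes
through the cluster locking code:

/*
 * Sketch of the "still open on another node" early exit described
 * above.
 */
static void ocfs2_delete_inode_sketch(struct inode *inode)
{
	if (ocfs2_inode_in_use_elsewhere(inode)) {	 /* invented query */
		write_inode_now(inode, 1);		 /* sync the inode */
		truncate_inode_pages(&inode->i_data, 0); /* drop its pages */
		clear_inode(inode);	/* local VFS may destroy it; the node
					 * still holding it open does the
					 * actual wipe later */
		return;
	}
	/* ... otherwise wipe the inode on disk locally ... */
}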
--Mark

--
Mark Fasheh
Senior Software Developer, Oracle
mark.fasheh@xxxxxxxxxx