[RFC] st_nlink after rmdir() and rename()

From: Al Viro
Date: Wed Mar 02 2011 - 22:25:01 EST


We have an interesting problem. Consider the following sequence
of syscalls:
	mkdir("foo", 0777);
	mkdir("bar", 0777);
	fd1 = open("foo", O_DIRECTORY);
	fd2 = open("bar", O_DIRECTORY);
	rename("foo", "bar");	/* kill old bar */
	rmdir("bar");		/* kill old foo */
	fstat(fd1, &buf1);
	fstat(fd2, &buf2);
What should be in buf1.st_nlink and buf2.st_nlink, if none of these
syscalls fails? Note that in both cases any lookups in the victim directory
will fail and so will readdir; as far as VFS is concerned, the effects of
such rmdir() and rename() on their victims are identical. In particular,
both . and .. are gone, as explicitly required by POSIX in case of rmdir().
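
For anyone who wants to try the sequence above on a given filesystem, here is
a minimal sketch with error checking added. The function name victim_nlinks()
and its out-parameters are made up for illustration; it assumes it is run with
the current directory on the filesystem under test and that "foo" and "bar" do
not already exist there.

```c
#define _GNU_SOURCE		/* for O_DIRECTORY on older libcs */
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: runs the mkdir/open/rename/rmdir/fstat sequence
 * in the current working directory.  On success returns 0 and stores
 * st_nlink of the rmdir() victim (old foo) and of the rename() victim
 * (old bar).  Returns -1 if any syscall fails.
 */
int victim_nlinks(nlink_t *rmdir_victim, nlink_t *rename_victim)
{
	struct stat buf1, buf2;
	int fd1 = -1, fd2 = -1, ret = -1;

	if (mkdir("foo", 0777) < 0)
		return -1;
	if (mkdir("bar", 0777) < 0) {
		rmdir("foo");
		return -1;
	}
	fd1 = open("foo", O_RDONLY | O_DIRECTORY);
	fd2 = open("bar", O_RDONLY | O_DIRECTORY);
	if (fd1 < 0 || fd2 < 0)
		goto out;
	if (rename("foo", "bar") < 0)	/* kill old bar */
		goto out;
	if (rmdir("bar") < 0)		/* kill old foo */
		goto out;
	if (fstat(fd1, &buf1) < 0 || fstat(fd2, &buf2) < 0)
		goto out;
	*rmdir_victim = buf1.st_nlink;
	*rename_victim = buf2.st_nlink;
	ret = 0;
out:
	if (fd1 >= 0)
		close(fd1);
	if (fd2 >= 0)
		close(fd2);
	/* best-effort cleanup; these fail harmlessly if nothing is left */
	rmdir("foo");
	rmdir("bar");
	return ret;
}
```

Calling it from a trivial main() and printing the two values in that order
gives the (buf1.st_nlink, buf2.st_nlink) pair for whatever filesystem the
current directory lives on.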

Surprisingly, the results are *NOT* identical wrt fstat(); for most of
the filesystems we will get 0 in both cases (as expected), but some will
leave 1 in buf2.st_nlink. What we have, as (buf1.st_nlink, buf2.st_nlink), is

0 0: ext*, xfs, jfs, reiserfs, ocfs2, gfs2, nilfs, exofs, udf, ubifs,
     minix, sysv, ufs, msdos, vfat, hfs+
0 1: ramfs, shmem, hugetlbfs, jffs2, omfs, hfs[*], apparently nfs as well
hell knows: ncpfs, fuse, ecryptfs, coda, cifs, ceph, btrfs, affs
1 1: (unless I'm misreading it) logfs
completely FUBAR wrt fstat(): hostfs[**]
-EEXIST on rename(): cgroup
-EINVAL on rename(): hpfs
server ought to fail such rename(): 9p
apparently fails rename(): smbfs
completely broken rename(): pohmelfs[***]

[*] yes, different from hfs+; the code is clearly broken, since it simply
does unlink() on target, without even verifying that it's empty. And
yes, it's trivial fs corruption...
[**] even open() + unlink() + fstat() will report original st_nlink, etc.
[***] new_dir is target's _parent_ directory and it's not required to be
empty; it's never going to be NULL, while we are at it. Code makes no
sense...
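
The regular-file check mentioned in [**] can be sketched the same way. This
is only an illustration; nlink_after_unlink() is a made-up name, and it
assumes the given path does not exist yet and lives on the filesystem being
tested.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a file, unlink it while it is still open,
 * and report what fstat() says about st_nlink afterwards.
 * Returns 0 on success, -1 if any syscall fails.
 */
int nlink_after_unlink(const char *path, nlink_t *out)
{
	struct stat st;
	int fd = open(path, O_CREAT | O_EXCL | O_RDWR, 0600);

	if (fd < 0)
		return -1;
	if (unlink(path) < 0) {
		close(fd);
		return -1;
	}
	if (fstat(fd, &st) < 0) {
		close(fd);
		return -1;
	}
	close(fd);
	*out = st.st_nlink;
	return 0;
}
```

A filesystem that tracks link counts properly is expected to report 0 here;
anything that still shows the original count has the hostfs-style problem.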

The variant with st_nlink getting to 0 in both cases is definitely the most
common and at least for local filesystems I think it should be mandatory.
I.e. ramfs and friends, jffs2, omfs and hfs should all switch to it.

Comments would be welcome; I really don't know the protocols of most of
the network filesystems well enough to tell what'll happen in these
situations.
Al, digging through i_nlink code audit...