Re: [git pull] vfs.git

From: Al Viro
Date: Tue May 17 2016 - 02:27:23 EST


On Mon, May 16, 2016 at 08:43:33AM -0700, Linus Torvalds wrote:
> On Sun, May 15, 2016 at 8:32 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> > FWIW, I considered sending that pile in several pull requests, but for some
> > reason git request-pull v4.6 vfs work.lookups spews something very odd into
> > diffstat - files that have never been touched by it and, in fact, doing
> > merge with mainline does *not* end up with those files anywhere in the
> > diff.
>
> This is "normal" if you have multiple merge bases (which in turn
> happens various ways, but all of them involved the branch having
> merges in itself, either back-merges or just merging two or more topic
> branches that had different starting points).

OK... What happened is that I started these branches off -rc1, then by -rc3
realized that PAGE_CACHE_SIZE->PAGE_SIZE stuff was going to create one hell
of merge noise and rebased work.lookups to it. I would've done the same to
work.xattr, but by then I had a never-rebase branch on top of it (cifs
xattr stuff) and that meant that work.xattr (including its initial segment
needed in work.lookups) was stuck at -rc1. Oh, well...

> > If you prefer that stuff to go in separate pulls, please say so.
>
> This time I'd really prefer it, especially that parallel lookup
> branch. That's such a big fundamental change that it definitely merits
> being merged separately rather than as part of "here's all the vfs
> changes for 4.7"..

OK. What I've got looks like this:

#work.const-path - rc1-based, no conflicts with anything, independent from
everything else

#work.misc - ditto

#work.iov_iter - ditto, but with merge from #for-linus at some point.
No conflicts either (and that #for-linus had also been rc1-based).

#work.preadv2 - rc3-based, no conflicts with anything, independent from
everything else

#sendmsg.cifs - rc1-based, trims quite a bit of now-unneeded crap from cifs.
One obvious conflict with PAGE_CACHE_SIZE->PAGE_SIZE in there. Independent
from everything else.

#work.xattr - rc1-based, starts with some acl fixes, then switches
->getxattr to passing inode and dentry separately. This is the point
where the things start to get tricky - that got merged into the very
beginning of the -rc3-based #work.lookups, to allow untangling the
security_d_instantiate() mess. That merge actually got one of the
PAGE_CACHE_SIZE conflicts resolved. #work.xattr itself proceeds to switch
a lot of filesystems to generic_...xattr(); no complications there.

#work.lookups, after that initial merge from #work.xattr, does the following:
* untangle security_d_instantiate()
* convert a bunch of open-coded lookup_one_len_unlocked() to calls
of that thing; one such place (in overlayfs) actually yields a trivial
conflict with overlayfs fixes later in the cycle - overlayfs ended up
switching to a variant of lookup_one_len_unlocked() sans the permission
checks. I would've dropped that commit (it gets overridden on merge from
#ovl-fixes in #for-next; proper resolution is to use the variant in mainline
fs/overlayfs/super.c), but I didn't want to rebase the damn thing - it was
fairly late in the cycle...
* some filesystems had managed to depend on lookup/lookup exclusion
for *fs-internal* data structures in a way that would break if we relaxed the
VFS exclusion. Fixing hadn't been hard, fortunately.
* core of that series - parallel lookup machinery, replacing ->i_mutex
with rwsem, making lookup_slow() take it only shared. At that point lookups
happen in parallel; lookups on the same name wait for the in-progress one
to be done with that dentry. Surprisingly little code, at that - almost all
of it is in fs/dcache.c, with fs/namei.c changes limited to lookup_slow() -
making it use the new primitive and actually switching to locking shared.
* parallel readdir stuff - first of all, we provide the exclusion
on per-struct file basis, same as we do for read() vs. lseek() for regular
files. That takes care of most of the needed exclusion in readdir/readdir;
however, these guys are trickier than lookups, so I went for switching them
one-by-one - new method (->iterate_shared()) is added and filesystems are
switched to it as they are either confirmed to be OK with shared lock on
directory or fixed to be OK with that. I hope to kill the original method
come next cycle (almost all in-tree filesystems are switched already), but
it's still not quite finished.
* several filesystems get switched to parallel readdir. The
interesting part here is dealing with dcache preseeding by readdir; that
needs minor adjustment to be safe with directory locked only shared.
Most of the filesystems doing that got switched in those commits.
Important exception: NFS. Turns out that NFS folks, with their, er,
insistence on VFS getting the fuck out of the way of the Smart Filesystem
Code That Knows How And What To Lock(tm) have grown the locking of their
own. Homegrown rwsem, with lookup/readdir/atomic_open being *writers*
(sillyunlink is the reader there). Of course, with VFS getting the fuck
out of the way, as requested, the actual smarts of the smart filesystem
code etc. had become exposed...
* do_last/lookup_open/atomic_open cleanups. As a result,
open() without O_CREAT locks the directory only shared. Including the
->atomic_open() case. Backmerge from #for-linus in the middle of that -
atomic_open() fix got brought in.
* then comes NFS switch to saner (VFS-based ;-) locking, killing
the homegrown "lookup and readdir are writers" kinda-sorta rwsem. All
exclusion for sillyunlink/lookup is done by the parallel lookups mechanism.
Exclusion between sillyunlink and rmdir is a real rwsem now - rmdir being
the writer. Result: NFS lookups/readdirs/O_CREAT-less opens happen in parallel
now.
* the rest of the series consists of switching a lot of filesystems
to parallel readdir; in a lot of cases ->llseek() gets simplified as well.
One backmerge in there (again, #for-linus - rockridge fix).
That's it for #work.lookups. There's one conflict in that pile (overlayfs
one mentioned above) and there's a conflict resolved in the first merge
(rc3 vs. beginning of #work.xattr).

FWIW, I think the least PITA would be if I send #work.lookups + backmerge from
#ovl-fixes to resolve the fs/overlayfs/super.c conflict. If you think that
it needs to be done in even smaller steps (after all, it's about 3/4 of the
entire pile), please say so. Otherwise, see below:

The following changes since commit 38b78a5f18584db6fa7441e0f4531b283b0e6725:

ovl: ignore permissions on underlying lookup (2016-05-10 23:58:18 -0400)

are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus

for you to fetch changes up to 0e0162bb8c008fa7742f69d4d4982c8a37b88f95:

Merge branch 'ovl-fixes' into for-linus (2016-05-17 02:17:59 -0400)

----------------------------------------------------------------
Al Viro (75):
reiserfs_cache_default_acl(): use get_acl()
don't bother with ->d_inode->i_sb - it's always equal to ->d_sb
cifs: kill more bogus checks in ->...xattr() methods
reiserfs: switch to generic_{get,set,remove}xattr()
xattr_handler: pass dentry and inode as separate arguments of ->get()
->getxattr(): pass dentry and inode as separate arguments
Merge getxattr prototype change into work.lookups
security_d_instantiate(): move to the point prior to attaching dentry to inode
kernfs: use lookup_one_len_unlocked()
configfs_detach_prep(): make sure that wait_mutex won't go away
ocfs2: don't open-code inode_lock/inode_unlock
orangefs: don't open-code inode_lock/inode_unlock
reiserfs: open-code reiserfs_mutex_lock_safe() in reiserfs_unpack()
reconnect_one(): use lookup_one_len_unlocked()
ovl_lookup_real(): use lookup_one_len_unlocked()
make ext2_get_page() and friends work without external serialization
nfs: missing wakeup in nfs_unblock_sillyrename()
lookup_slow(): bugger off on IS_DEADDIR() from the very beginning
__d_add(): don't drop/regain ->d_lock
beginning of transition to parallel lookups - marking in-lookup dentries
parallel lookups machinery, part 2
parallel lookups machinery, part 3
parallel lookups machinery, part 4 (and last)
parallel lookups: actual switch to rwsem
give readdir(2)/getdents(2)/etc. uniform exclusion with lseek()
introduce a parallel variant of ->iterate()
proc_fill_cache(): switch to d_alloc_parallel()
proc_sys_fill_cache(): switch to d_alloc_parallel()
switch all procfs directories ->iterate_shared()
fuse: switch to ->iterate_shared()
cifs: switch to ->iterate_shared()
dcache_{readdir,dir_lseek}() users: switch to ->iterate_shared
simple local filesystems: switch to ->iterate_shared()
path_openat(): take O_PATH handling out of do_last()
lookup_open(): expand the call of vfs_create()
Merge branch 'for-linus' into work.lookups
atomic_open(): don't bother with EEXIST check - it's done in do_last()
atomic_open(): consolidate "overridden ENOENT" in open-yourself cases
atomic_open(): massage the create_error logics a bit
do_last(): get rid of duplicate ELOOP check
do_last(): take fput() on error after opening to out:
atomic_open(): delay open_to_namei_flags() until the method call
atomic_open(): be paranoid about may_open() return value
lookup_open(): lift the "fallback to !O_CREAT" logics from atomic_open()
atomic_open(): reorder and clean up a bit
lookup_open(): expand the call of real_lookup()
lookup_open(): put the dentry fed to ->lookup() or ->atomic_open() into in-lookup hash
lookup_open(): lock the parent shared unless O_CREAT is given
nfs: switch to ->iterate_shared()
nfs: per-name sillyunlink exclusion
configfs_readdir(): make safe under shared lock
kernfs: no point locking directory around that generic_file_llseek()
lustre: don't need to lock inode in directory lseek
more trivial ->iterate_shared conversions
romfs, squashfs: switch to ->iterate_shared()
fat: switch to ->iterate_shared()
9p: switch to ->iterate_shared()
Merge branch 'for-linus' into work.lookups
switch ecryptfs to ->iterate_shared
logfs: no need to lock directory in lseek
btrfs: switch to ->iterate_shared()
get_acorn_filename(): deobfuscate a bit
isofs: switch to ->iterate_shared()
befs: constify stuff a bit
befs: switch to ->iterate_shared()
afs: switch to ->iterate_shared()
f2fs: switch to ->iterate_shared()
gfs2: switch to ->iterate_shared()
hpfs: handle allocation failures in hpfs_add_pos()
hpfs: switch to ->iterate_shared()
hostfs: switch to ->iterate_shared()
hfsplus: switch to ->iterate_shared()
hfs: switch to ->iterate_shared()
ext4: switch to ->iterate_shared()
Merge branch 'ovl-fixes' into for-linus

Andreas Gruenbacher (3):
jfs: Remove unnecessary code in jfs_get_acl
posix_acl: Inode acl caching fixes
posix_acl: Unexport acl_by_type and make it static

Documentation/filesystems/porting | 53 +++
arch/alpha/kernel/osf_sys.c | 4 +-
arch/powerpc/platforms/cell/spufs/inode.c | 2 +-
block/blk-map.c | 47 +--
drivers/staging/lustre/lustre/llite/dir.c | 4 +-
.../staging/lustre/lustre/llite/llite_internal.h | 4 +-
drivers/staging/lustre/lustre/llite/xattr.c | 6 +-
fs/9p/acl.c | 8 +-
fs/9p/vfs_dir.c | 4 +-
fs/9p/vfs_inode.c | 2 +-
fs/9p/xattr.c | 4 +-
fs/affs/dir.c | 2 +-
fs/afs/dir.c | 16 +-
fs/autofs4/root.c | 4 +-
fs/bad_inode.c | 4 +-
fs/befs/befs.h | 4 +-
fs/befs/btree.c | 16 +-
fs/befs/btree.h | 4 +-
fs/befs/datastream.c | 26 +-
fs/befs/datastream.h | 11 +-
fs/befs/linuxvfs.c | 6 +-
fs/bfs/dir.c | 2 +-
fs/btrfs/acl.c | 3 -
fs/btrfs/inode.c | 2 +-
fs/btrfs/ioctl.c | 18 +-
fs/btrfs/tree-log.c | 6 +-
fs/btrfs/xattr.c | 6 +-
fs/ceph/acl.c | 2 +
fs/ceph/super.h | 2 +-
fs/ceph/xattr.c | 8 +-
fs/cifs/cifs_dfs_ref.c | 2 +-
fs/cifs/cifsfs.c | 2 +-
fs/cifs/cifsfs.h | 2 +-
fs/cifs/inode.c | 3 +-
fs/cifs/readdir.c | 57 +--
fs/cifs/xattr.c | 56 +--
fs/coda/dir.c | 18 +-
fs/compat.c | 12 +-
fs/configfs/dir.c | 37 +-
fs/configfs/inode.c | 2 +-
fs/cramfs/inode.c | 2 +-
fs/dcache.c | 267 ++++++++++++--
fs/ecryptfs/crypto.c | 5 +-
fs/ecryptfs/ecryptfs_kernel.h | 4 +-
fs/ecryptfs/file.c | 73 +++-
fs/ecryptfs/inode.c | 23 +-
fs/ecryptfs/mmap.c | 3 +-
fs/efs/dir.c | 3 +-
fs/efs/namei.c | 2 +-
fs/exofs/dir.c | 16 +-
fs/exofs/super.c | 2 +-
fs/exportfs/expfs.c | 12 +-
fs/ext2/acl.c | 3 -
fs/ext2/dir.c | 16 +-
fs/ext2/namei.c | 2 +-
fs/ext2/xattr_security.c | 6 +-
fs/ext2/xattr_trusted.c | 6 +-
fs/ext2/xattr_user.c | 8 +-
fs/ext4/acl.c | 3 -
fs/ext4/dir.c | 4 +-
fs/ext4/namei.c | 4 +-
fs/ext4/xattr_security.c | 6 +-
fs/ext4/xattr_trusted.c | 6 +-
fs/ext4/xattr_user.c | 8 +-
fs/f2fs/acl.c | 3 -
fs/f2fs/dir.c | 2 +-
fs/f2fs/namei.c | 2 +-
fs/f2fs/xattr.c | 14 +-
fs/fat/dir.c | 6 +-
fs/file.c | 5 +
fs/freevxfs/vxfs_lookup.c | 2 +-
fs/fuse/dir.c | 99 +++--
fs/gfs2/file.c | 4 +-
fs/gfs2/inode.c | 9 +-
fs/gfs2/ops_fstype.c | 4 +-
fs/gfs2/super.c | 2 +-
fs/gfs2/xattr.c | 6 +-
fs/hfs/attr.c | 5 +-
fs/hfs/catalog.c | 3 +
fs/hfs/dir.c | 12 +-
fs/hfs/hfs_fs.h | 5 +-
fs/hfs/inode.c | 2 +
fs/hfsplus/catalog.c | 3 +
fs/hfsplus/dir.c | 12 +-
fs/hfsplus/hfsplus_fs.h | 1 +
fs/hfsplus/inode.c | 1 +
fs/hfsplus/posix_acl.c | 3 -
fs/hfsplus/super.c | 1 +
fs/hfsplus/xattr.c | 10 +-
fs/hfsplus/xattr.h | 2 +-
fs/hfsplus/xattr_security.c | 6 +-
fs/hfsplus/xattr_trusted.c | 6 +-
fs/hfsplus/xattr_user.c | 6 +-
fs/hostfs/hostfs_kern.c | 2 +-
fs/hpfs/dir.c | 12 +-
fs/hpfs/dnode.c | 8 +-
fs/hpfs/hpfs_fn.h | 2 +-
fs/inode.c | 17 +-
fs/isofs/dir.c | 4 +-
fs/isofs/rock.c | 13 +-
fs/jffs2/acl.c | 2 -
fs/jffs2/dir.c | 4 +-
fs/jffs2/security.c | 6 +-
fs/jffs2/super.c | 2 +-
fs/jffs2/xattr_trusted.c | 6 +-
fs/jffs2/xattr_user.c | 6 +-
fs/jfs/acl.c | 6 -
fs/jfs/jfs_xattr.h | 2 +-
fs/jfs/namei.c | 2 +-
fs/jfs/xattr.c | 8 +-
fs/kernfs/dir.c | 17 +-
fs/kernfs/inode.c | 6 +-
fs/kernfs/kernfs-internal.h | 4 +-
fs/kernfs/mount.c | 5 +-
fs/libfs.c | 11 +-
fs/logfs/dir.c | 4 +-
fs/minix/dir.c | 2 +-
fs/namei.c | 399 ++++++++++-----------
fs/nfs/dir.c | 80 +++--
fs/nfs/direct.c | 2 +-
fs/nfs/inode.c | 4 +-
fs/nfs/nfs3acl.c | 43 ++-
fs/nfs/nfs4proc.c | 14 +-
fs/nfs/nfstrace.h | 2 +-
fs/nfs/unlink.c | 192 +++-------
fs/nfsd/nfs3proc.c | 4 +-
fs/nfsd/nfs3xdr.c | 2 +-
fs/nfsd/nfsfh.c | 2 +-
fs/nilfs2/dir.c | 16 +-
fs/nilfs2/namei.c | 2 +-
fs/ocfs2/aops.c | 4 +-
fs/ocfs2/dlmglue.c | 3 +
fs/ocfs2/file.c | 2 +-
fs/ocfs2/inode.c | 2 +-
fs/ocfs2/xattr.c | 20 +-
fs/omfs/dir.c | 2 +-
fs/open.c | 2 +-
fs/openpromfs/inode.c | 2 +-
fs/orangefs/file.c | 4 +-
fs/orangefs/orangefs-kernel.h | 4 +-
fs/orangefs/xattr.c | 10 +-
fs/overlayfs/inode.c | 4 +-
fs/overlayfs/overlayfs.h | 4 +-
fs/overlayfs/readdir.c | 4 +-
fs/overlayfs/super.c | 2 +-
fs/posix_acl.c | 116 +++---
fs/proc/base.c | 35 +-
fs/proc/fd.c | 8 +-
fs/proc/generic.c | 2 +-
fs/proc/namespaces.c | 3 +-
fs/proc/proc_net.c | 2 +-
fs/proc/proc_sysctl.c | 17 +-
fs/proc/root.c | 4 +-
fs/qnx4/dir.c | 2 +-
fs/qnx6/dir.c | 2 +-
fs/read_write.c | 12 -
fs/readdir.c | 37 +-
fs/reiserfs/dir.c | 2 +-
fs/reiserfs/file.c | 6 +-
fs/reiserfs/ioctl.c | 6 +-
fs/reiserfs/namei.c | 18 +-
fs/reiserfs/xattr.c | 54 ---
fs/reiserfs/xattr.h | 9 +-
fs/reiserfs/xattr_acl.c | 8 +-
fs/reiserfs/xattr_security.c | 19 +-
fs/reiserfs/xattr_trusted.c | 19 +-
fs/reiserfs/xattr_user.c | 19 +-
fs/romfs/super.c | 4 +-
fs/splice.c | 3 +
fs/squashfs/dir.c | 4 +-
fs/squashfs/xattr.c | 6 +-
fs/sysv/dir.c | 2 +-
fs/ubifs/dir.c | 2 +-
fs/ubifs/ubifs.h | 4 +-
fs/ubifs/xattr.c | 6 +-
fs/udf/dir.c | 2 +-
fs/udf/namei.c | 2 +-
fs/ufs/dir.c | 16 +-
fs/ufs/super.c | 2 +-
fs/xattr.c | 12 +-
fs/xfs/xfs_acl.c | 20 +-
fs/xfs/xfs_file.c | 2 +-
fs/xfs/xfs_xattr.c | 6 +-
include/linux/dcache.h | 26 +-
include/linux/file.h | 13 +
include/linux/fs.h | 51 ++-
include/linux/nfs_fs.h | 11 +-
include/linux/nfs_xdr.h | 4 +-
include/linux/posix_acl.h | 1 -
include/linux/uio.h | 1 +
include/linux/xattr.h | 5 +-
include/trace/events/ext4.h | 6 +-
kernel/audit_watch.c | 2 +-
lib/iov_iter.c | 19 +
mm/shmem.c | 9 +-
net/socket.c | 2 +-
security/commoncap.c | 6 +-
security/integrity/evm/evm_main.c | 6 +-
security/selinux/hooks.c | 11 +-
security/smack/smack_lsm.c | 6 +-
200 files changed, 1545 insertions(+), 1300 deletions(-)