Re: dcache: remove trylock loops (was Re: [BUG] lock_parent() breakage when used from shrink_dentry_list())
From: Eric W. Biederman
Date: Tue Mar 13 2018 - 00:04:06 EST
Al Viro <viro@xxxxxxxxxxxxxxxxxx> writes:
> On Tue, Mar 13, 2018 at 12:37:51AM +0000, Al Viro wrote:
>> On Mon, Mar 12, 2018 at 06:52:31PM -0500, Eric W. Biederman wrote:
>>
>> > Ah. I see now there is now the s_roots list that handles
>> > that bit of strangeness.
>> >
>> > So one path is to simply remove the heuristic from
>> > path_connected.
>> >
>> > Another path is to have nfsv2 and nfsv3 not set s_root at all.
>> > Leaving the heuristic working for the rest of the filesystems,
>> > and generally simplifying the code.
>> >
>> > Something like the diff below I should think.
>>
>> > + /* Leave nfsv2 and nfsv3 s_root == NULL */
>>
>> Now, grep fs/super.c for s_root. Or try to boot it, for that
>> matter...
>
> BTW, if rename happens on server and we step into directory
> we'd already seen in one subtree while doing a lookup in
> another, we will get it moved around. Without having the
> subtrees ever connected in dcache on client. So adding
> && IS_ROOT(sb->s_root) to the test also won't work.

Nope. We fundamentally need to call is_subdir in the nfs case
to ensure we don't have crazy problems.

I believe the patch below is the obviously correct fix (that still
preserves some caching). I need to look at nilfs as it also calls
d_obtain_root. I am also wondering if some of the other network
filesystems might be susceptible to problems caused by renames on
the server.

It is tempting to be more clever and not consider NFS_MOUNT_UNSHARED
mounts or mounts without multiple s_roots but there can be renames
on the server that should trip up even those cases. At least if
anything figures out how to trigger the dentries in the path to ever get
revalidated.

Eric

diff --git a/fs/namei.c b/fs/namei.c
index 921ae32dbc80..cafa365eeb70 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -559,9 +559,10 @@ static int __nd_alloc_stack(struct nameidata *nd)
 static bool path_connected(const struct path *path)
 {
 	struct vfsmount *mnt = path->mnt;
+	struct super_block *sb = mnt->mnt_sb;

-	/* Only bind mounts can have disconnected paths */
-	if (mnt->mnt_root == mnt->mnt_sb->s_root)
+	/* Bind mounts and multi-root filesystems can have disconnected paths */
+	if (!(sb->s_iflags & SB_I_MULTIROOT) && (mnt->mnt_root == sb->s_root))
 		return true;

 	return is_subdir(path->dentry, mnt->mnt_root);
diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index 29bacdc56f6a..64129a72f312 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -2631,6 +2631,7 @@ struct dentry *nfs_fs_mount_common(struct nfs_server *server,
 		/* initial superblock/root creation */
 		mount_info->fill_super(s, mount_info);
 		nfs_get_cache_cookie(s, mount_info->parsed, mount_info->cloned);
+		s->s_iflags |= SB_I_MULTIROOT;
 	}

 	mntroot = nfs_get_root(s, mount_info->mntfh, dev_name);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2a815560fda0..0430e03febaa 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1317,6 +1317,7 @@ extern int send_sigurg(struct fown_struct *fown);
 #define SB_I_CGROUPWB 0x00000001 /* cgroup-aware writeback enabled */
 #define SB_I_NOEXEC 0x00000002 /* Ignore executables on this fs */
 #define SB_I_NODEV 0x00000004 /* Ignore devices on this fs */
+#define SB_I_MULTIROOT 0x00000008 /* Multiple roots to the dentry tree */

 /* sb->s_iflags to limit user namespace mounts */
 #define SB_I_USERNS_VISIBLE 0x00000010 /* fstype already mounted */
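
For anyone who wants to poke at the logic outside the kernel, here is a
rough user-space sketch of the path_connected() change. It is only an
illustration: the structs are cut-down stand-ins I made up for this
example (they share names with the kernel ones but nothing else),
is_subdir() is reduced to a plain parent walk with no locking or
seqcount retry, and demo_path_connected() is a hypothetical helper that
builds a superblock with two disconnected roots and asks about a dentry
that lives under the second one.

```c
#include <stdbool.h>

/* Same value as in the patch; everything else here is a user-space stand-in. */
#define SB_I_MULTIROOT 0x00000008

/* Hypothetical, heavily simplified versions of the kernel structures. */
struct dentry { struct dentry *d_parent; };	/* a root is its own parent */
struct super_block { unsigned long s_iflags; struct dentry *s_root; };
struct vfsmount { struct dentry *mnt_root; struct super_block *mnt_sb; };
struct path { struct vfsmount *mnt; struct dentry *dentry; };

/* Simplified is_subdir(): walk up until we reach 'root' or run out of parents. */
static bool is_subdir(const struct dentry *d, const struct dentry *root)
{
	for (;;) {
		if (d == root)
			return true;
		if (d->d_parent == d)
			return false;
		d = d->d_parent;
	}
}

/* Mirrors the patched test: the shortcut is skipped for multi-root superblocks. */
static bool path_connected(const struct path *path)
{
	struct vfsmount *mnt = path->mnt;
	struct super_block *sb = mnt->mnt_sb;

	if (!(sb->s_iflags & SB_I_MULTIROOT) && (mnt->mnt_root == sb->s_root))
		return true;

	return is_subdir(path->dentry, mnt->mnt_root);
}

/*
 * Hypothetical scenario: the mount is rooted at root_a (which is also
 * s_root), but the dentry we ended up on hangs off a second,
 * disconnected root, root_b.
 */
static bool demo_path_connected(unsigned long s_iflags)
{
	struct dentry root_a, root_b, child_b;
	struct super_block sb;
	struct vfsmount mnt;
	struct path p;

	root_a.d_parent = &root_a;
	root_b.d_parent = &root_b;
	child_b.d_parent = &root_b;

	sb.s_iflags = s_iflags;
	sb.s_root = &root_a;
	mnt.mnt_root = &root_a;
	mnt.mnt_sb = &sb;
	p.mnt = &mnt;
	p.dentry = &child_b;

	return path_connected(&p);
}
```

In this model demo_path_connected(0) takes the mnt_root == s_root
shortcut and reports the path as connected even though child_b is not
under root_a, while demo_path_connected(SB_I_MULTIROOT) falls through to
the is_subdir() walk and reports it as disconnected.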