Re: [REPOST PATCH v4 4/5] kernfs: use i_lock to protect concurrent inode updates

From: Miklos Szeredi
Date: Tue Jun 01 2021 - 09:19:14 EST


On Fri, 28 May 2021 at 08:34, Ian Kent <raven@xxxxxxxxxx> wrote:
>
> The inode operations .permission() and .getattr() use the kernfs node
> write lock but all that's needed is to keep the rb tree stable while
> updating the inode attributes as well as protecting the update itself
> against concurrent changes.
>
> And .permission() is called frequently during path walks and can cause
> quite a bit of contention between kernfs node operations and path
> walks when the number of concurrent walks is high.
>
> To change kernfs_iop_getattr() and kernfs_iop_permission() to take
> the rwsem read lock instead of the write lock, an additional lock is
> needed to protect against multiple processes concurrently updating
> the inode attributes and link count in kernfs_refresh_inode().
>
> The inode i_lock seems like the sensible thing to use to protect these
> inode attribute updates so use it in kernfs_refresh_inode().
>
> Signed-off-by: Ian Kent <raven@xxxxxxxxxx>
> ---
> fs/kernfs/inode.c | 10 ++++++----
> fs/kernfs/mount.c | 4 ++--
> 2 files changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/fs/kernfs/inode.c b/fs/kernfs/inode.c
> index 3b01e9e61f14e..6728ecd81eb37 100644
> --- a/fs/kernfs/inode.c
> +++ b/fs/kernfs/inode.c
> @@ -172,6 +172,7 @@ static void kernfs_refresh_inode(struct kernfs_node *kn, struct inode *inode)
>  {
>  	struct kernfs_iattrs *attrs = kn->iattr;
>
> +	spin_lock(&inode->i_lock);
>  	inode->i_mode = kn->mode;
>  	if (attrs)
>  		/*
> @@ -182,6 +183,7 @@ static void kernfs_refresh_inode(struct kernfs_node *kn, struct inode *inode)
>
>  	if (kernfs_type(kn) == KERNFS_DIR)
>  		set_nlink(inode, kn->dir.subdirs + 2);
> +	spin_unlock(&inode->i_lock);
>  }
>
> int kernfs_iop_getattr(struct user_namespace *mnt_userns,
> @@ -191,9 +193,9 @@ int kernfs_iop_getattr(struct user_namespace *mnt_userns,
>  	struct inode *inode = d_inode(path->dentry);
>  	struct kernfs_node *kn = inode->i_private;
>
> -	down_write(&kernfs_rwsem);
> +	down_read(&kernfs_rwsem);
>  	kernfs_refresh_inode(kn, inode);
> -	up_write(&kernfs_rwsem);
> +	up_read(&kernfs_rwsem);
>
>  	generic_fillattr(&init_user_ns, inode, stat);
>  	return 0;
> @@ -284,9 +286,9 @@ int kernfs_iop_permission(struct user_namespace *mnt_userns,
>
>  	kn = inode->i_private;
>
> -	down_write(&kernfs_rwsem);
> +	down_read(&kernfs_rwsem);
>  	kernfs_refresh_inode(kn, inode);
> -	up_write(&kernfs_rwsem);
> +	up_read(&kernfs_rwsem);
>
>  	return generic_permission(&init_user_ns, inode, mask);
>  }
> diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
> index baa4155ba2edf..f2f909d09f522 100644
> --- a/fs/kernfs/mount.c
> +++ b/fs/kernfs/mount.c
> @@ -255,9 +255,9 @@ static int kernfs_fill_super(struct super_block *sb, struct kernfs_fs_context *k
>  	sb->s_shrink.seeks = 0;
>
>  	/* get root inode, initialize and unlock it */
> -	down_write(&kernfs_rwsem);
> +	down_read(&kernfs_rwsem);
>  	inode = kernfs_get_inode(sb, info->root->kn);
> -	up_write(&kernfs_rwsem);
> +	up_read(&kernfs_rwsem);
>  	if (!inode) {
>  		pr_debug("kernfs: could not get root inode\n");
>  		return -ENOMEM;
>

This last hunk is not mentioned in the patch header. Why is this needed?

Otherwise looks good.

Thanks,
Miklos