Re: [PATCH 2/2] xfs: make sure link path does not go away at access
From: Dave Chinner
Date: Thu Nov 11 2021 - 19:33:04 EST
On Thu, Nov 11, 2021 at 11:39:30AM +0800, Ian Kent wrote:
> When following a trailing symlink in rcu-walk mode it's possible to
> succeed in getting the ->get_link() method pointer but the link path
> string be deallocated while it's being used.
>
> Utilize the rcu mechanism to mitigate this risk.
>
> Suggested-by: Miklos Szeredi <miklos@xxxxxxxxxx>
> Signed-off-by: Ian Kent <raven@xxxxxxxxxx>
> ---
> fs/xfs/kmem.h | 4 ++++
> fs/xfs/xfs_inode.c | 4 ++--
> fs/xfs/xfs_iops.c | 10 ++++++++--
> 3 files changed, 14 insertions(+), 4 deletions(-)
>
> diff --git a/fs/xfs/kmem.h b/fs/xfs/kmem.h
> index 54da6d717a06..c1bd1103b340 100644
> --- a/fs/xfs/kmem.h
> +++ b/fs/xfs/kmem.h
> @@ -61,6 +61,10 @@ static inline void kmem_free(const void *ptr)
>  {
>  	kvfree(ptr);
>  }
> +static inline void kmem_free_rcu(const void *ptr)
> +{
> +	kvfree_rcu(ptr);
> +}
>
>
> static inline void *
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index a4f6f034fb81..aaa1911e61ed 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -2650,8 +2650,8 @@ xfs_ifree(
>  	 * already been freed by xfs_attr_inactive.
>  	 */
>  	if (ip->i_df.if_format == XFS_DINODE_FMT_LOCAL) {
> -		kmem_free(ip->i_df.if_u1.if_data);
> -		ip->i_df.if_u1.if_data = NULL;
> +		kmem_free_rcu(ip->i_df.if_u1.if_data);
> +		RCU_INIT_POINTER(ip->i_df.if_u1.if_data, NULL);
>  		ip->i_df.if_bytes = 0;
>  	}
How do we get here in a way that the VFS will walk into this inode
during a lookup?

I mean, the dentry has to be validated and held during the RCU path
walk, so if we are running a transaction to mark the inode as free
here it has already been unlinked and the dentry turned
negative. So anything that is doing a lockless pathwalk onto that
dentry *should* see that it is a negative dentry at this point and
hence nothing should be walking any further or trying to access the
link that was shared from ->get_link().

AFAICT, that's what the sequence check fix in your previous patch
guarantees. It makes no difference whether the unlinked inode has
been recycled or not: the lookup race condition is the same in that
the inode has gone through ->destroy_inode and is now owned by the
filesystem and not the VFS.
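To make the sequence check argument concrete, here is a rough userspace
model of it. This is only a sketch: struct demo_dentry and every demo_*
function are hypothetical names, it is single-threaded, and it has none
of the memory barriers that the kernel's read_seqcount_begin() and
read_seqcount_retry() provide. It only shows why a walker that sampled
the sequence count before the unlink must not use the link afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a dentry: a sequence count plus link text. */
struct demo_dentry {
	unsigned	seq;	/* even: stable, odd: update in progress */
	const char	*link;	/* NULL once the dentry is negative */
};

/* Sample the sequence count before looking at the dentry. */
static unsigned demo_read_begin(const struct demo_dentry *d)
{
	return d->seq;
}

/* True if the dentry changed (or was changing) since @seq was sampled. */
static bool demo_read_retry(const struct demo_dentry *d, unsigned seq)
{
	return (seq & 1) || d->seq != seq;
}

/* Writer side: turn the dentry negative, bumping the count around it. */
static void demo_unlink(struct demo_dentry *d)
{
	assert(d != NULL);
	d->seq++;		/* odd: update in progress */
	d->link = NULL;
	d->seq++;		/* even again, but no longer the old value */
}

/*
 * Lockless reader: fetch the link, then revalidate against the count
 * sampled earlier in the walk. A stale walker gets NULL and has to
 * fall back instead of touching the link.
 */
static const char *demo_get_link(const struct demo_dentry *d, unsigned seq)
{
	const char *link = d->link;

	if (demo_read_retry(d, seq))
		return NULL;
	return link;
}
```

In this model, a walker that called demo_read_begin() before
demo_unlink() gets NULL back from demo_get_link() afterwards, which is
the "nothing should be walking any further" behaviour described above.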
Otherwise, it might just be best to memset the buffer to zero here
rather than free it, and leave it to be freed when the inode is
freed from the RCU callback in xfs_inode_free_callback() as per
normal.
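Something along these lines, as a userspace sketch rather than real XFS
code. struct demo_ifork and the demo_* helpers are made-up names, and
the if_alloc field is an assumption added so the sketch knows how much
to zero. The point is only that the buffer stays allocated after the
"ifree" step, so a racing reader that still holds the pointer sees an
empty string rather than freed memory, and the actual free happens once
at final teardown.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the inline data fork of an inode. */
struct demo_ifork {
	char	*if_data;	/* inline symlink target, NUL-terminated */
	size_t	if_bytes;	/* number of valid bytes in if_data */
	size_t	if_alloc;	/* allocated size (assumed field) */
};

/* Allocate a fork holding a copy of @target. Returns 0 on success. */
static int demo_ifork_init(struct demo_ifork *ifp, const char *target)
{
	size_t len = strlen(target);

	ifp->if_data = malloc(len + 1);
	if (!ifp->if_data)
		return -1;
	memcpy(ifp->if_data, target, len + 1);
	ifp->if_bytes = len;
	ifp->if_alloc = len + 1;
	return 0;
}

/* Inode-free step: scrub the buffer but keep it allocated. */
static void demo_ifree(struct demo_ifork *ifp)
{
	assert(ifp != NULL);
	if (ifp->if_data)
		memset(ifp->if_data, 0, ifp->if_alloc);
	ifp->if_bytes = 0;
}

/* Final teardown: models the free done from the RCU callback. */
static void demo_destroy(struct demo_ifork *ifp)
{
	free(ifp->if_data);
	ifp->if_data = NULL;
	ifp->if_alloc = 0;
}
```

A reader that grabbed if_data before demo_ifree() ran can still
dereference it safely until demo_destroy(); it just finds a zero-length
path.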
Cheers,
Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx