Re: [PATCH v2] xfs: introduce protection for drop nlink

From: Dave Chinner
Date: Tue Sep 12 2023 - 18:25:38 EST


On Mon, Sep 11, 2023 at 04:12:56PM +0800, cheng.lin130@xxxxxxxxxx wrote:
> From: Cheng Lin <cheng.lin130@xxxxxxxxxx>
>
> When abnormal drop_nlink are detected on the inode,
> shutdown filesystem, to avoid corruption propagation.
>
> Signed-off-by: Cheng Lin <cheng.lin130@xxxxxxxxxx>
> ---
> fs/xfs/xfs_fsops.c | 3 +++
> fs/xfs/xfs_inode.c | 9 +++++++++
> fs/xfs/xfs_mount.h | 1 +
> 3 files changed, 13 insertions(+)
>
> diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
> index 7cb75cb6b..6fc1cfe83 100644
> --- a/fs/xfs/xfs_fsops.c
> +++ b/fs/xfs/xfs_fsops.c
> @@ -543,6 +543,9 @@ xfs_do_force_shutdown(
>  	} else if (flags & SHUTDOWN_CORRUPT_ONDISK) {
>  		tag = XFS_PTAG_SHUTDOWN_CORRUPT;
>  		why = "Corruption of on-disk metadata";
> +	} else if (flags & SHUTDOWN_CORRRUPT_ABN) {
> +		tag = XFS_PTAG_SHUTDOWN_CORRUPT;
> +		why = "Corruption of Abnormal conditions";

We don't need a new shutdown tag. We can consider this in-memory
corruption because we detected it in memory before it went to disk
(SHUTDOWN_CORRUPT_INCORE), or even on-disk corruption because the
reference count on disk is likely wrong at this point.

>  	} else if (flags & SHUTDOWN_DEVICE_REMOVED) {
>  		tag = XFS_PTAG_SHUTDOWN_IOERROR;
>  		why = "Block device removal";
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index 9e62cc500..2d41f2461 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -919,6 +919,15 @@ xfs_droplink(
>  	xfs_trans_t *tp,
>  	xfs_inode_t *ip)
>  {
> +
> +	if (VFS_I(ip)->i_nlink == 0) {
> +		xfs_alert(ip->i_mount,
> +			  "%s: Deleting inode %llu with no links.",
> +			  __func__, ip->i_ino);
> +		xfs_force_shutdown(ip->i_mount, SHUTDOWN_CORRRUPT_ABN);
> +		return -EFSCORRUPTED;
> +	}
> +
>  	xfs_trans_ichgtime(tp, ip, XFS_ICHGTIME_CHG);
>
>  	drop_nlink(VFS_I(ip));

I'd kind of prefer that drop_nlink() be made to return an error on
underrun - if it's important enough to drop a warning in the log and
potentially panic the kernel, it's important enough to tell the
filesystem an underrun has occurred. But that opens a whole new can
of worms, so I think this will be fine.
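
To illustrate the idea of an error-returning drop_nlink(), here is a
minimal user-space sketch. It is not kernel code: struct model_inode,
model_drop_nlink() and the choice of -EINVAL are all hypothetical names
and values picked for illustration, and the in-kernel drop_nlink()
returns void.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for struct inode; only the link count is modelled. */
struct model_inode {
	unsigned int nlink;
};

/*
 * Hypothetical variant of drop_nlink() that reports an underrun to the
 * caller instead of only logging a warning. The error code is an
 * assumption made for this sketch.
 */
static int model_drop_nlink(struct model_inode *inode)
{
	if (inode->nlink == 0) {
		fprintf(stderr, "link count underrun\n");
		return -EINVAL;
	}
	inode->nlink--;
	return 0;
}
```

With something of this shape, the caller could propagate the failure
rather than having to test the link count itself before the call.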

Note that we don't actually need a call to shut the filesystem down.
Simply returning -EFSCORRUPTED will result in the filesystem being
shut down if the transaction is dirty when it gets cancelled due to
the droplink error.
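
A rough user-space model of that behaviour, assuming a much simplified
transaction: all sk_* names and SK_EFSCORRUPTED are hypothetical
stand-ins for this sketch, not the XFS structures or functions. The
droplink model only returns an error; the shutdown decision is left to
the cancel path, which acts only when the transaction is dirty.

```c
#include <stdbool.h>

#define SK_EFSCORRUPTED 117	/* assumed value for illustration */

/* Hypothetical, heavily reduced stand-ins for mount, transaction, inode. */
struct sk_mount { bool shutdown; };
struct sk_trans { struct sk_mount *mp; bool dirty; };
struct sk_inode { unsigned int nlink; };

/* Detect the underrun and return an error; no explicit shutdown call. */
static int sk_droplink(struct sk_trans *tp, struct sk_inode *ip)
{
	if (ip->nlink == 0)
		return -SK_EFSCORRUPTED;
	ip->nlink--;
	tp->dirty = true;	/* the inode was modified in this transaction */
	return 0;
}

/* Cancelling a dirty transaction shuts the (model) filesystem down. */
static void sk_trans_cancel(struct sk_trans *tp)
{
	if (tp->dirty && !tp->mp->shutdown)
		tp->mp->shutdown = true;
}
```

In this model a clean transaction that hits the error is cancelled
without a shutdown, while one that has already modified something is
shut down on cancel.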

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx