Re: [PATCH v6 07/26] fs/dax: Ensure all pages are idle prior to filesystem unmount
From: Dan Williams
Date: Mon Jan 13 2025 - 18:43:12 EST
Alistair Popple wrote:
> File systems call dax_break_mapping() prior to reallocating file
> system blocks to ensure the page is not undergoing any DMA or other
> accesses. Generally this is needed when a file is truncated to ensure
> that if a block is reallocated nothing is writing to it. However,
> file systems currently don't call this when an FS DAX inode is evicted.
>
> This can cause problems when the file system is unmounted as a page
> can continue to be undergoing DMA or other remote access after
> unmount. This means if the file system is remounted any truncate or
> other operation which requires the underlying file system block to be
> freed will not wait for the remote access to complete. Therefore a
> busy block may be reallocated to a new file leading to corruption.
>
> Signed-off-by: Alistair Popple <apopple@xxxxxxxxxx>
>
> ---
>
> Changes for v5:
>
> - Don't wait for pages to be idle in non-DAX mappings
> ---
> fs/dax.c | 29 +++++++++++++++++++++++++++++
> fs/ext4/inode.c | 32 ++++++++++++++------------------
> fs/xfs/xfs_inode.c | 9 +++++++++
> fs/xfs/xfs_inode.h | 1 +
> fs/xfs/xfs_super.c | 18 ++++++++++++++++++
> include/linux/dax.h | 2 ++
> 6 files changed, 73 insertions(+), 18 deletions(-)
>
> diff --git a/fs/dax.c b/fs/dax.c
> index 7008a73..4e49cc4 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -883,6 +883,14 @@ static int wait_page_idle(struct page *page,
>  				TASK_INTERRUPTIBLE, 0, 0, cb(inode));
>  }
>
> +static void wait_page_idle_uninterruptible(struct page *page,
> +					void (cb)(struct inode *),
> +					struct inode *inode)
> +{
> +	___wait_var_event(page, page_ref_count(page) == 1,
> +			TASK_UNINTERRUPTIBLE, 0, 0, cb(inode));
> +}
> +
>  /*
>   * Unmaps the inode and waits for any DMA to complete prior to deleting the
>   * DAX mapping entries for the range.
> @@ -911,6 +919,27 @@ int dax_break_mapping(struct inode *inode, loff_t start, loff_t end,
>  }
>  EXPORT_SYMBOL_GPL(dax_break_mapping);
>
> +void dax_break_mapping_uninterruptible(struct inode *inode,
> +				void (cb)(struct inode *))
> +{
> +	struct page *page;
> +
> +	if (!dax_mapping(inode->i_mapping))
> +		return;
> +
> +	do {
> +		page = dax_layout_busy_page_range(inode->i_mapping, 0,
> +						LLONG_MAX);
> +		if (!page)
> +			break;
> +
> +		wait_page_idle_uninterruptible(page, cb, inode);
> +	} while (true);
> +
> +	dax_delete_mapping_range(inode->i_mapping, 0, LLONG_MAX);
> +}
> +EXPORT_SYMBOL_GPL(dax_break_mapping_uninterruptible);

Riffing off of Darrick's feedback, how about calling this
dax_break_layout_final()?
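
To make the naming concrete, here is a rough user-space model of the loop
under the suggested name. Everything prefixed mock_ is a hypothetical
stand-in invented for illustration and is not a kernel API; only the loop
shape (find a busy page, wait until its refcount drops to 1, repeat, then
delete the mapping) is taken from the quoted hunk. The callback here just
pretends that one outstanding reference gets dropped per call, which is an
assumption to keep the model terminating.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: refcount == 1 means idle. */
struct mock_page {
	int refcount;
};

/* Hypothetical stand-in for an inode with a DAX mapping. */
struct mock_inode {
	struct mock_page *pages;
	size_t nr_pages;
	int cb_calls;
	bool mapping_deleted;
};

/* Models dax_layout_busy_page_range(): first busy page, or NULL. */
static struct mock_page *mock_busy_page(struct mock_inode *inode)
{
	for (size_t i = 0; i < inode->nr_pages; i++)
		if (inode->pages[i].refcount > 1)
			return &inode->pages[i];
	return NULL;
}

/* Models the cb argument: pretend one remote reference goes away. */
static void mock_dma_progress(struct mock_inode *inode)
{
	struct mock_page *page = mock_busy_page(inode);

	inode->cb_calls++;
	if (page)
		page->refcount--;
}

/* Models the uninterruptible wait: no early exit, no return value. */
static void mock_wait_page_idle(struct mock_page *page,
				void (*cb)(struct mock_inode *),
				struct mock_inode *inode)
{
	while (page->refcount != 1)
		cb(inode);
}

/* Models the final break: drain every busy page, then drop the mapping. */
static void mock_break_layout_final(struct mock_inode *inode,
				    void (*cb)(struct mock_inode *))
{
	struct mock_page *page;

	while ((page = mock_busy_page(inode)) != NULL)
		mock_wait_page_idle(page, cb, inode);

	assert(mock_busy_page(inode) == NULL);
	inode->mapping_deleted = true;
}
```

The point of the "_final" suffix in this model is that the function cannot
fail or be interrupted: by the time it returns, no page is busy and the
mapping is gone.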