Re: [PATCH] hugetlbfs: change put_page/unlock_page order in hugetlbfs_fallocate()

From: Mike Kravetz
Date: Sun Aug 27 2017 - 13:16:07 EST


On 08/26/2017 12:11 PM, Nadav Amit wrote:
> hugetlfs_fallocate() currently performs put_page() before unlock_page().
> This scenario opens a small time window, from the time the page is added
> to the page cache, until it is unlocked, in which the page might be
> removed from the page-cache by another core. If the page is removed
> during this time windows, it might cause a memory corruption, as the
> wrong page will be unlocked.
>
> It is arguable whether this scenario can happen in a real system, and
> there are several mitigating factors. The issue was found by code
> inspection (actually grep), and not by actually triggering the flow.
> Yet, since putting the page before unlocking is incorrect it should be
> fixed, if only to prevent future breakage or someone copy-pasting this
> code.
>
> Fixes: 70c3547e36f5c ("hugetlbfs: add hugetlbfs_fallocate()")
>
> cc: Eric Biggers <ebiggers3@xxxxxxxxx>
> cc: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
>
> Signed-off-by: Nadav Amit <namit@xxxxxxxxxx>

Thank you Nadav.

Reviewed-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>

Since hugetlbfs is an in memory filesystem, the only way one 'should' be
able to remove a page (file content) is through an inode operation such as
truncate, hole punch, or unlink. That was the basis for my response that
the inode lock would be required for page freeing.

Eric's question about sys_fadvise64(POSIX_FADV_DONTNEED) is interesting.
I was expecting to see a check for hugetlbfs pages and exit (without
modification) if encountered. A quick review of the code did not find
any such checks.

I'll take a closer look to determine exactly how hugetlbfs files are
handled. IMO, there should be something similar to the DAX check where
the routine quickly exits.
--
Mike Kravetz

> ---
> fs/hugetlbfs/inode.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index 28d2753be094..9475fee79cee 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -655,11 +655,11 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
> mutex_unlock(&hugetlb_fault_mutex_table[hash]);
>
> /*
> - * page_put due to reference from alloc_huge_page()
> * unlock_page because locked by add_to_page_cache()
> + * page_put due to reference from alloc_huge_page()
> */
> - put_page(page);
> unlock_page(page);
> + put_page(page);
> }
>
> if (!(mode & FALLOC_FL_KEEP_SIZE) && offset + len > inode->i_size)
>