Re: [PATCH v4] mm, hugetlb: fix racy resv_huge_pages underflow on UFFDIO_COPY

From: Mike Kravetz
Date: Fri Jun 04 2021 - 17:42:14 EST


On 5/27/21 5:50 PM, Mina Almasry wrote:
> diff --git a/include/linux/migrate.h b/include/linux/migrate.h
> index 4bb4e519e3f5..4164c9ddd86e 100644
> --- a/include/linux/migrate.h
> +++ b/include/linux/migrate.h
> @@ -51,6 +51,7 @@ extern int migrate_huge_page_move_mapping(struct address_space *mapping,
> struct page *newpage, struct page *page);
> extern int migrate_page_move_mapping(struct address_space *mapping,
> struct page *newpage, struct page *page, int extra_count);
> +extern void migrate_copy_huge_page(struct page *dst, struct page *src);
> #else
>
> static inline void putback_movable_pages(struct list_head *l) {}
> @@ -77,6 +78,9 @@ static inline int migrate_huge_page_move_mapping(struct address_space *mapping,
> return -ENOSYS;
> }
>
> +static inline void migrate_copy_huge_page(struct page *dst, struct page *src)
> +{
> +}
> #endif /* CONFIG_MIGRATION */
>
> #ifdef CONFIG_COMPACTION

I am not insisting, but it might be better to make the copy routine
available under its current name, 'copy_huge_page'.
Why?
There is an existing migrate_page_copy() which copies not only the page
contents, but also the page state/metadata. People could be confused to
find that 'migrate_page_copy' and 'migrate_copy_huge_page' do not have
the same functionality. Of course, as soon as you look at the routines
you can see the difference.

Again, not necessary. Just something to consider. I suspect you
changed the name to 'migrate_copy_huge_page' mostly because it resides
in migrate.c?
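
To make the naming concern concrete, here is a rough userspace sketch,
not kernel code. The toy_page struct and both toy_* functions are made
up for illustration; they only mimic the split between a contents-only
copy and a copy that also carries state.

```c
#include <string.h>

/* Hypothetical stand-in for struct page: some state plus a payload. */
struct toy_page {
	unsigned long flags;	/* stands in for page state/metadata */
	char data[64];		/* stands in for the page contents */
};

/* Analogous to a contents-only copy: state in dst is left untouched. */
static void toy_copy_contents(struct toy_page *dst, const struct toy_page *src)
{
	memcpy(dst->data, src->data, sizeof(dst->data));
}

/* Analogous to a migrate-style copy: contents and state both move. */
static void toy_migrate_copy(struct toy_page *dst, const struct toy_page *src)
{
	toy_copy_contents(dst, src);
	dst->flags = src->flags;
}
```

With two similar names, a caller could easily reach for the wrong one
and either lose state or copy state that should not be copied.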

> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 76e2a6efc165..6072c9f82794 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -30,6 +30,7 @@
> #include <linux/numa.h>
> #include <linux/llist.h>
> #include <linux/cma.h>
> +#include <linux/migrate.h>
>
> #include <asm/page.h>
> #include <asm/pgalloc.h>
> @@ -4905,20 +4906,17 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
> struct page **pagep)
> {
> bool is_continue = (mode == MCOPY_ATOMIC_CONTINUE);
> - struct address_space *mapping;
> - pgoff_t idx;
> + struct hstate *h = hstate_vma(dst_vma);
> + struct address_space *mapping = dst_vma->vm_file->f_mapping;
> + pgoff_t idx = vma_hugecache_offset(h, dst_vma, dst_addr);
> unsigned long size;
> int vm_shared = dst_vma->vm_flags & VM_SHARED;
> - struct hstate *h = hstate_vma(dst_vma);
> pte_t _dst_pte;
> spinlock_t *ptl;
> - int ret;
> + int ret = -ENOMEM;
> struct page *page;
> int writable;
>
> - mapping = dst_vma->vm_file->f_mapping;
> - idx = vma_hugecache_offset(h, dst_vma, dst_addr);
> -
> if (is_continue) {
> ret = -EFAULT;
> page = find_lock_page(mapping, idx);
> @@ -4947,12 +4945,44 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
> /* fallback to copy_from_user outside mmap_lock */
> if (unlikely(ret)) {
> ret = -ENOENT;
> + /* Free the allocated page which may have
> + * consumed a reservation.
> + */
> + restore_reserve_on_error(h, dst_vma, dst_addr, page);
> + put_page(page);
> +
> + /* Allocate a temporary page to hold the copied
> + * contents.
> + */
> + page = alloc_huge_page_vma(h, dst_vma, dst_addr);
> + if (IS_ERR(page)) {

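In v3 of the patch, alloc_migrate_huge_page was used to allocate the
temporary page and Dan Carpenter pointed out that the return value should
just be checked for NULL. I believe the same still applies to
alloc_huge_page_vma: as far as I can tell it returns NULL on failure
rather than an ERR_PTR, so the IS_ERR() check would not catch a failed
allocation.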
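
As a rough illustration of why the kind of check matters, here is a
userspace sketch that mimics the kernel's ERR_PTR/IS_ERR convention. The
toy_* names are made up, and TOY_MAX_ERRNO is assumed to match the
kernel's MAX_ERRNO of 4095. The point is only that an IS_ERR-style test
does not treat NULL as an error.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095	/* assumed to mirror the kernel's MAX_ERRNO */

/* Encode a negative errno value in a pointer, like ERR_PTR(). */
static inline void *toy_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* True only for pointers in the top TOY_MAX_ERRNO addresses, like IS_ERR(). */
static inline int toy_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-TOY_MAX_ERRNO;
}

/* Would an IS_ERR-style check notice this allocation result as a failure? */
static int toy_failure_seen_by_is_err(const void *page)
{
	return toy_is_err(page);
}

/* Would a plain NULL check notice this allocation result as a failure? */
static int toy_failure_seen_by_null_check(const void *page)
{
	return page == NULL;
}
```

If the allocator reports failure with NULL, only the second check sees
it; the first lets the NULL pointer through to the code that follows.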

--
Mike Kravetz