Re: [PATCH v3] mm, hugetlb: fix resv_huge_pages underflow on UFFDIO_COPY
From: Andrew Morton
Date: Sat May 22 2021 - 17:19:50 EST
On Fri, 21 May 2021 00:44:33 -0700 Mina Almasry <almasrymina@xxxxxxxxxx> wrote:
> The userfaultfd hugetlb tests detect a resv_huge_pages underflow. This
> happens when hugetlb_mcopy_atomic_pte() is called with !is_continue on
> an index for which we already have a page in the cache. When this
> happens, we allocate a second page, double consuming the reservation,
> and then fail to insert the page into the cache and return -EEXIST.
>
> To fix this, we first if there exists a page in the cache which already
                       ^ check
> consumed the reservation, and return -EEXIST immediately if so.
>
> Secondly, if we fail to copy the page contents while holding the
> hugetlb_fault_mutex, we will drop the mutex and return to the caller
> after allocating a page that consumed a reservation. In this case there
> may be a fault that double consumes the reservation. To handle this, we
> free the allocated page, fix the reservations, and allocate a temporary
> hugetlb page and return that to the caller. When the caller does the
> copy outside of the lock, we again check the cache, and allocate a page
> consuming the reservation, and copy over the contents.
>
> Test:
> Hacked the code locally such that resv_huge_pages underflows produce
> a warning and the copy_huge_page_from_user() always fails, then:
>
> ./tools/testing/selftests/vm/userfaultfd hugetlb_shared 10 \
>	2 /tmp/kokonut_test/huge/userfaultfd_test && echo test success
> ./tools/testing/selftests/vm/userfaultfd hugetlb 10 \
>	2 /tmp/kokonut_test/huge/userfaultfd_test && echo test success
>
> Both tests succeed and produce no warnings. After the test runs, the
> number of free/resv hugepages is correct.
>
> ...
>
>  include/linux/hugetlb.h |   4 ++
>  mm/hugetlb.c            | 103 ++++++++++++++++++++++++++++++++++++----
>  mm/migrate.c            |  39 +++------------
>  3 files changed, 103 insertions(+), 43 deletions(-)

I'm assuming we want this in -stable?

Are we able to identify a Fixes: for this?

It's a large change.  Can we come up with a smaller version that is
easier to review and integrate, which we can feed into 5.13 and -stable,
and do the fancier version for 5.14?

If you don't think -stable needs this then this version will be OK as-is.
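
For illustration only, here is a userspace toy model of the first fix
described in the quoted changelog (check the cache before allocating).
It is not the kernel code: toy_mapping, toy_alloc_page(),
toy_copy_buggy() and toy_copy_fixed() are hypothetical names, and the
model assumes one reservation per index and that every allocation
consumes exactly one reservation.  The counter is signed here so the
underflow shows up as -1; the kernel counter is unsigned and would wrap
instead.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_INDEXES 4

/* Toy stand-in for the relevant state; not a kernel structure. */
struct toy_mapping {
	bool cached[TOY_NR_INDEXES];	/* page already in the "cache"? */
	long resv_huge_pages;		/* outstanding reservations */
};

/* Assumption: allocating a page always consumes one reservation. */
static void toy_alloc_page(struct toy_mapping *m)
{
	m->resv_huge_pages--;
}

/*
 * Models the behaviour the changelog describes as buggy: allocate
 * first, then discover the index is already populated and bail out
 * with -EEXIST, leaving the reservation consumed a second time.
 */
static int toy_copy_buggy(struct toy_mapping *m, size_t idx)
{
	assert(idx < TOY_NR_INDEXES);
	toy_alloc_page(m);
	if (m->cached[idx])
		return -EEXIST;
	m->cached[idx] = true;
	return 0;
}

/*
 * Models the first fix: look in the cache before allocating, so an
 * already-populated index never touches the reservation count.
 */
static int toy_copy_fixed(struct toy_mapping *m, size_t idx)
{
	assert(idx < TOY_NR_INDEXES);
	if (m->cached[idx])
		return -EEXIST;
	toy_alloc_page(m);
	m->cached[idx] = true;
	return 0;
}
```

Starting from resv_huge_pages = 1, two calls to toy_copy_buggy() on
index 0 leave the counter at -1, while two calls to toy_copy_fixed()
leave it at 0; the second call returns -EEXIST in both cases.  The
second scenario in the changelog (copy failure with the mutex dropped)
is not modelled here.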