Re: [PATCH 5/6] userfaultfd: gate must_wait writability check on pte_present()

From: Lorenzo Stoakes

Date: Mon Jun 01 2026 - 14:19:27 EST


On Fri, May 29, 2026 at 06:23:29PM +0100, Kiryl Shutsemau (Meta) wrote:
> userfaultfd_must_wait() and userfaultfd_huge_must_wait() read the PTE
> without taking the page table lock and then apply pte_write() /
> huge_pte_write() to it. Those accessors decode bits from the present
> encoding only; on a swap or migration entry they read the offset bits
> that happen to share the same position and return an undefined result.
>
> The intent of the check is "is this fault still WP-blocked?". A
> non-marker swap entry means the page is in transit -- the userfault
> context the original fault delivered against is no longer the same,
> and the swap-in or migration completion path will re-deliver a fresh
> fault if userspace still needs to handle it. Worst case under the
> current code the garbage write bit says "wait", and the thread stays
> asleep until a UFFDIO_WAKE that may never arrive.
>
> Gate the writability check on pte_present() so the lockless re-check
> only inspects present-PTE bits when the entry is actually present.
> The non-present, non-marker case returns "don't wait" and lets the
> fault path retry.
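
To make the failure mode above concrete, here is a minimal userspace
sketch. It assumes a toy PTE layout (bit 0 = present, bit 1 = write,
swap offset stored from bit 1 upwards in non-present entries). This is
not any real architecture's encoding, and every toy_* name and both
must_wait_* helpers are hypothetical, not the kernel's functions. It
also leaves out the none and marker cases that the real code handles
earlier.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy PTE: illustrative layout only, not a real architecture's. */
typedef uint64_t toy_pte_t;

#define TOY_PRESENT	(1ULL << 0)	/* entry maps a page */
#define TOY_WRITE	(1ULL << 1)	/* valid only when present */
#define TOY_SWP_SHIFT	1		/* non-present: offset starts at bit 1 */

static bool toy_pte_present(toy_pte_t pte)
{
	return (pte & TOY_PRESENT) != 0;
}

/* Decodes the present encoding only, like pte_write(). */
static bool toy_pte_write(toy_pte_t pte)
{
	return (pte & TOY_WRITE) != 0;
}

/* Build a non-present "swap" entry whose offset overlaps TOY_WRITE. */
static toy_pte_t toy_mk_swap(uint64_t offset)
{
	return offset << TOY_SWP_SHIFT;
}

/* Ungated re-check: the answer depends on whatever offset bit 0 is. */
static bool must_wait_ungated(toy_pte_t pte)
{
	return !toy_pte_write(pte);
}

/* Gated re-check: a non-present entry means "don't wait, retry". */
static bool must_wait_gated(toy_pte_t pte)
{
	if (!toy_pte_present(pte))
		return false;
	return !toy_pte_write(pte);
}
```

In this toy model must_wait_ungated() gives different answers for
toy_mk_swap(2) and toy_mk_swap(3) purely because of the offset's low
bit, while must_wait_gated() returns false for both and still reports
"wait" for a present, write-protected entry.
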
>
> Fixes: 369cd2121be4 ("userfaultfd: hugetlbfs: userfaultfd_huge_must_wait for hugepmd ranges")
> Fixes: 63b2d4174c4a ("userfaultfd: wp: add the writeprotect API to userfaultfd ioctl")
> Cc: stable@xxxxxxxxxxxxxxx
> Reported-by: Sashiko AI review <sashiko-bot@xxxxxxxxxx>
> Signed-off-by: Kiryl Shutsemau <kas@xxxxxxxxxx>

One tiny nit: this could maybe mention softleaf :P but it's not important!

LGTM, so:

Reviewed-by: Lorenzo Stoakes <ljs@xxxxxxxxxx>

> ---
> mm/userfaultfd.c | 20 ++++++++++++++++++++
> 1 file changed, 20 insertions(+)
>
> diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
> index 35b206cc9aa6..f6d2a1c67019 100644
> --- a/mm/userfaultfd.c
> +++ b/mm/userfaultfd.c
> @@ -2535,6 +2535,15 @@ static inline bool userfaultfd_huge_must_wait(struct userfaultfd_ctx *ctx,
>  	/* UFFD PTE markers require userspace to resolve the fault. */
>  	if (pte_is_uffd_marker(pte))
>  		return true;
> +	/*
> +	 * Concurrent migration may have replaced the present PTE with a
> +	 * non-marker swap entry between fault delivery and this lockless
> +	 * re-check. huge_pte_write() on a swap entry decodes random offset
> +	 * bits, so gate it on pte_present(). The migration completion path
> +	 * will re-deliver the fault if it still needs userspace.
> +	 */
> +	if (!pte_present(pte))
> +		return false;
>  	/*
>  	 * If VMA has UFFD WP faults enabled and WP fault, wait for userspace to
>  	 * resolve the fault.
> @@ -2621,6 +2630,17 @@ static inline bool userfaultfd_must_wait(struct userfaultfd_ctx *ctx,
>  	/* UFFD PTE markers require userspace to resolve the fault. */
>  	if (pte_is_uffd_marker(ptent))
>  		goto out;
> +	/*
> +	 * Concurrent swap-out / migration may have replaced the present PTE
> +	 * with a non-marker swap entry between fault delivery and this
> +	 * lockless re-check. pte_write() on a swap entry decodes random
> +	 * offset bits, so gate it on pte_present(). The page-in path will
> +	 * re-deliver the fault if it still needs userspace.
> +	 */
> +	if (!pte_present(ptent)) {
> +		ret = false;
> +		goto out;
> +	}
>  	/*
>  	 * If VMA has UFFD WP faults enabled and WP fault, wait for userspace to
>  	 * resolve the fault.
> --
> 2.54.0
>