Re: [PATCH v2 05/24] shmem/userfaultfd: Handle uffd-wp special pte in page fault handler

From: Peter Xu
Date: Tue Apr 27 2021 - 14:54:19 EST


On Tue, Apr 27, 2021 at 12:12:58PM -0400, Peter Xu wrote:
> File-backed memory is prone to unmap/swap, so its ptes are always unstable.
> This could cause userfaultfd-wp information to be lost when such memory, for
> example shmem, is unmapped or swapped out. To keep that information
> persistent, we will start to use the newly introduced swap-like special ptes to
> replace a null pte when those ptes are removed.
>
> Prepare this by handling such a special pte first before it is applied. Here
> a new fault flag FAULT_FLAG_UFFD_WP is introduced. When this flag is set, it

FAULT_FLAG_UFFD_WP does not exist any more. Obviously I should have touched up
the commit message when touching up the code...

> means the current fault is to resolve a page access (either read or write) to
> the uffd-wp special pte.
>
> The handling of this special pte page fault is similar to missing fault, but it
> should happen after the pte missing logic since the special pte is designed to
> be a swap-like pte. Meanwhile it should be handled before do_swap_page() so
> that the swap core logic won't be confused to see such an illegal swap pte.
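
To illustrate the ordering described in the paragraph above, here is a minimal userspace sketch. It is a toy model, not the code in mm/memory.c: struct toy_pte, enum toy_fault_path and toy_dispatch() are hypothetical names invented for this example. The only point it makes is that a swap-like special pte is neither "none" nor "present", so it has to be recognized before the generic swap path gets to see it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of a pte; not the kernel's real encoding. */
struct toy_pte {
	bool none;            /* nothing installed at all */
	bool present;         /* a page is mapped */
	bool uffd_wp_special; /* swap-like marker that only records uffd-wp */
};

enum toy_fault_path {
	TOY_PATH_MISSING,         /* ordinary missing-page handling */
	TOY_PATH_UFFD_WP_SPECIAL, /* recover the wr-protected pte */
	TOY_PATH_SWAP,            /* stands in for do_swap_page() */
	TOY_PATH_PRESENT,         /* pte is mapped, e.g. a write-protect fault */
};

static enum toy_fault_path toy_dispatch(struct toy_pte pte)
{
	/* A pte cannot be both empty and mapped in this model. */
	assert(!(pte.none && pte.present));

	/* 1. The pte-missing logic runs first. */
	if (pte.none)
		return TOY_PATH_MISSING;

	if (!pte.present) {
		/*
		 * 2. The special pte looks like a swap pte (not none, not
		 *    present), so it must be filtered out here ...
		 */
		if (pte.uffd_wp_special)
			return TOY_PATH_UFFD_WP_SPECIAL;

		/* 3. ... before the swap logic sees an entry it cannot parse. */
		return TOY_PATH_SWAP;
	}

	return TOY_PATH_PRESENT;
}
```

If the check in step 2 were moved after step 3, the special pte would be routed to the swap path in this model, which is the confusion the commit message wants to avoid.
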
>
> This is a slow path of uffd-wp handling, because unmap of wr-protected shmem
> ptes should be rare. So far it should only trigger in two conditions:
>
> (1) When trying to punch holes in shmem_fallocate(), there will be a
> pre-unmap optimization before evicting the page. That will create
> unmapped shmem ptes with wr-protected pages covered.
>
> (2) Swapping out of shmem pages
>
> Because of this, the page fault handling is simplified too by not sending the
> wr-protect message in the 1st page fault; instead the page will be installed
> read-only, so the message will not be generated until the next do_wp_page() call.
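
The two-step behaviour described in the paragraph above can be sketched as a small state machine. This is a userspace toy under stated assumptions, not kernel code: struct toy_mapping and toy_fault() are hypothetical, and the message counter merely stands in for the wr-protect notification that the real write-protect fault path would deliver.

```c
#include <stdbool.h>

/* Hypothetical model of one virtual page; all names are illustrative. */
struct toy_mapping {
	bool special;    /* uffd-wp special pte in place, no page mapped */
	bool present;    /* a page is mapped */
	bool writable;   /* the mapped pte allows writes */
	bool uffd_wp;    /* the mapped pte carries the wr-protect marker */
	int wp_messages; /* wr-protect messages delivered so far */
};

static void toy_fault(struct toy_mapping *m, bool write)
{
	if (m->special) {
		/*
		 * 1st fault (read or write): install the page read-only and
		 * keep the wr-protect marker. No message is sent here.
		 */
		m->special = false;
		m->present = true;
		m->writable = false;
		m->uffd_wp = true;
		return;
	}

	if (m->present && write && !m->writable && m->uffd_wp) {
		/*
		 * A later write hits the read-only pte. This models the
		 * do_wp_page() step where the message is finally generated.
		 */
		m->wp_messages++;
	}
}
```

A write to the page therefore takes two faults in this model: the first only restores the pte, and the retried access produces the message.
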
>
> Disable fault-around for such a special page fault, because the introduced new
> flag (FAULT_FLAG_UFFD_WP) only applies to current pte rather than all the pages

Same here.

> around it. Doing fault-around with the new flag could confuse all the rest of
> pages when installing ptes from page cache when there's a cache hit.
>
> Signed-off-by: Peter Xu <peterx@xxxxxxxxxx>
> ---
> include/linux/userfaultfd_k.h | 11 +++++
> mm/memory.c | 80 ++++++++++++++++++++++++++++++++---
> 2 files changed, 86 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
> index bc733512c6905..fefebe6e96560 100644
> --- a/include/linux/userfaultfd_k.h
> +++ b/include/linux/userfaultfd_k.h
> @@ -89,6 +89,17 @@ static inline bool uffd_disable_huge_pmd_share(struct vm_area_struct *vma)
>  	return vma->vm_flags & (VM_UFFD_WP | VM_UFFD_MINOR);
> }
>
> +/*
> + * Don't do fault around for FAULT_FLAG_UFFD_WP because it means we want to

Same here...

> + * recover a previously wr-protected pte. This flag is a per-pte information,
> + * so it could confuse all the pages around the current page when faulted in.
> + * Similar reason for MINOR mode faults.
> + */
> +static inline bool uffd_disable_fault_around(struct vm_area_struct *vma)
> +{
> +	return vma->vm_flags & (VM_UFFD_WP | VM_UFFD_MINOR);
> +}
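
As a rough illustration of how a helper like the one above might be consumed, here is a userspace sketch. The flag values, toy_disable_fault_around() and toy_pages_to_map() are all hypothetical stand-ins chosen for this example; they are not the kernel's definitions, and the real caller is not part of this hunk.

```c
#include <stdbool.h>

/* Hypothetical bit values; the kernel's VM_UFFD_* constants differ. */
#define TOY_VM_UFFD_WP    (1UL << 0)
#define TOY_VM_UFFD_MINOR (1UL << 1)
#define TOY_VM_OTHER      (1UL << 2)

/* Same shape as the quoted helper, but taking the flags directly. */
static bool toy_disable_fault_around(unsigned long vm_flags)
{
	return (vm_flags & (TOY_VM_UFFD_WP | TOY_VM_UFFD_MINOR)) != 0;
}

/*
 * Assumed caller: map only the faulting page when fault-around is
 * disabled, otherwise map the whole fault-around window.
 */
static unsigned long toy_pages_to_map(unsigned long vm_flags,
				      unsigned long window)
{
	return toy_disable_fault_around(vm_flags) ? 1UL : window;
}
```

The check is per-vma in this sketch, so any uffd-wp or minor-mode registration on the area turns fault-around off for every fault in it, even for ptes that were never wr-protected.
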

--
Peter Xu