Re: [PATCH v2 08/14] userfaultfd: add UFFDIO_REGISTER_MODE_RWP and UFFDIO_RWPROTECT plumbing
From: Mike Rapoport
Date: Tue May 12 2026 - 13:20:51 EST
On Fri, May 08, 2026 at 04:55:20PM +0100, Kiryl Shutsemau (Meta) wrote:
> Add the userspace interface for read-write protection tracking:
>
> - UFFDIO_REGISTER_MODE_RWP   register a range for RWP tracking
> - UFFD_FEATURE_RWP           capability bit
> - UFFDIO_RWPROTECT           install / remove RWP on a range
>
> Registration sets VM_UFFD_RWP on the VMA. Combining MODE_WP with
> MODE_RWP is rejected because both modes claim the uffd PTE bit.
>
> UFFDIO_RWPROTECT is the bidirectional counterpart of
> UFFDIO_WRITEPROTECT:
>
> - MODE_RWP    change_protection() with MM_CP_UFFD_RWP
>               installs PAGE_NONE and sets the uffd bit on
>               present PTEs
> - !MODE_RWP   change_protection() with MM_CP_UFFD_RWP_RESOLVE
>               restores vma->vm_page_prot and clears the bit
>
> userfaultfd_clear_vma() runs the same resolve pass on unregister so
> RWP state cannot outlive the uffd.
>
> Re-registering a range must not drop a mode that installs per-PTE
> markers (WP or RWP); doing so returns -EBUSY. This also closes a
> pre-existing window where re-registering without MODE_WP would strand
> uffd-wp markers: before, those caused extra write-faults but were
> otherwise benign; with RWP preservation in place, a subsequent
> mprotect() on a VM_UFFD_RWP VMA would silently promote the stale
> markers to RWP.
>
> The feature is not yet advertised. UFFDIO_REGISTER_MODE_RWP,
> UFFD_FEATURE_RWP, and _UFFDIO_RWPROTECT are intentionally absent from
> UFFD_API_REGISTER_MODES, UFFD_API_FEATURES, and UFFD_API_RANGE_IOCTLS,
> so UFFDIO_API masks them out and the register-mode validator rejects
> the bit. The follow-up patch adds fault dispatch and exposes the UAPI.
>
> Signed-off-by: Kiryl Shutsemau <kas@xxxxxxxxxx>
> Assisted-by: Claude:claude-opus-4-6
Reviewed-by: Mike Rapoport (Microsoft) <rppt@xxxxxxxxxx>
with a comment below
> ---
>  Documentation/admin-guide/mm/userfaultfd.rst | 10 ++
>  fs/userfaultfd.c                             | 84 +++++++++++++++++
>  include/linux/userfaultfd_k.h                |  2 +
>  include/uapi/linux/userfaultfd.h             | 19 ++++
>  mm/userfaultfd.c                             | 97 +++++++++++++++++++-
>  5 files changed, 209 insertions(+), 3 deletions(-)
>
> +	/*
> +	 * Pre-scan the range: validate every spanned VMA before applying
> +	 * any change_protection() so a partial failure cannot leave the
> +	 * process with only a prefix of the range re-protected.
> +	 */
> +	err = -ENOENT;
> +	for_each_vma_range(vmi, dst_vma, end) {
> +		if (!userfaultfd_rwp(dst_vma))
> +			return -ENOENT;
> +
> +		if (is_vm_hugetlb_page(dst_vma)) {
> +			unsigned long page_mask;
> +
> +			page_mask = vma_kernel_pagesize(dst_vma) - 1;
> +			if ((start & page_mask) || (len & page_mask))
> +				return -EINVAL;
> +		}
> +		err = 0;
> +	}
> +	if (err)
> +		return err;
It's an interesting way to say "no VMA found in range" :)
I think a 'bool found' that is set inside the loop, followed by

	if (!found)
		return -ENOENT;

looks more readable.
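
To illustrate the shape I have in mind, here is a minimal userspace
sketch, not the actual patch. struct mock_vma, its fields and
prescan_range() are hypothetical stand-ins for the kernel VMA,
userfaultfd_rwp(), vma_kernel_pagesize() and the for_each_vma_range()
walk; only the 'found' flag is the point:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a VMA; not the kernel's struct vm_area_struct. */
struct mock_vma {
	unsigned long start;
	unsigned long end;
	bool rwp;		/* stands in for userfaultfd_rwp(vma) */
	unsigned long pagesize;	/* 0 for non-hugetlb, else the huge page size */
};

/*
 * Validate every mock VMA overlapping [start, start + len) before any
 * change would be applied. Returns 0, -ENOENT or -EINVAL.
 */
static int prescan_range(const struct mock_vma *vmas, size_t nr,
			 unsigned long start, unsigned long len)
{
	unsigned long end = start + len;
	bool found = false;
	size_t i;

	for (i = 0; i < nr; i++) {
		const struct mock_vma *vma = &vmas[i];

		/* Skip VMAs that do not overlap the requested range. */
		if (vma->end <= start || vma->start >= end)
			continue;

		if (!vma->rwp)
			return -ENOENT;

		if (vma->pagesize) {
			unsigned long page_mask = vma->pagesize - 1;

			if ((start & page_mask) || (len & page_mask))
				return -EINVAL;
		}
		found = true;
	}

	/* No VMA found in range. */
	if (!found)
		return -ENOENT;

	return 0;
}
```

With the flag, the "nothing in range" case is stated explicitly instead
of being encoded in the initial value of err.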
--
Sincerely yours,
Mike.