Re: [PATCH v3 1/4] mm: Introduce vm_uffd_ops API

From: David Hildenbrand

Date: Tue Sep 30 2025 - 05:37:02 EST


On 26.09.25 23:16, Peter Xu wrote:
Currently, most of the userfaultfd features are implemented directly in the
core mm. It will invoke VMA specific functions whenever necessary. So far
it is fine because it almost only interacts with shmem and hugetlbfs.

Introduce a generic userfaultfd API extension for vm_operations_struct,
so that any code that implements vm_operations_struct (including kernel
modules that can be compiled separately from the kernel core) can support
userfaults without modifying the core files.

With this API applied, if a module wants to support userfaultfd, the
module should only need to properly define vm_uffd_ops and hook it to
vm_operations_struct, instead of changing anything in core mm.

This API will not work for anonymous memory. Handling of userfault
operations for anonymous memory remains unchanged in core mm.

Due to a security concern while reviewing older versions of this series
[1], uffd_copy() will be temporarily removed. IOW, so far MISSING-capable
memory types can only be hard-coded and implemented in mm/. It would also
affect UFFDIO_COPY and UFFDIO_ZEROPAGE. Other functions should still be
able to be provided from vm_uffd_ops.

This patch only introduces the API, so that existing userfaultfd users can
be moved over without breaking them.

[1] https://lore.kernel.org/all/20250627154655.2085903-1-peterx@xxxxxxxxxx/


Looks much better with the uffdio_copy stuff removed for now.

Signed-off-by: Peter Xu <peterx@xxxxxxxxxx>
---
include/linux/mm.h | 9 +++++++++
include/linux/userfaultfd_k.h | 37 +++++++++++++++++++++++++++++++++++
2 files changed, 46 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 6b6c6980f46c2..8afb93387e2c6 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -620,6 +620,8 @@ struct vm_fault {
*/
};
+struct vm_uffd_ops;
+
/*
* These are the virtual MM functions - opening of an area, closing and
* unmapping it (needed to keep files on disk up-to-date etc), pointer
@@ -705,6 +707,13 @@ struct vm_operations_struct {
struct page *(*find_normal_page)(struct vm_area_struct *vma,
unsigned long addr);
#endif /* CONFIG_FIND_NORMAL_PAGE */
+#ifdef CONFIG_USERFAULTFD
+ /*
+ * Userfaultfd related ops. Modules need to define this to support
+ * userfaultfd.
+ */
+ const struct vm_uffd_ops *userfaultfd_ops;
+#endif
};
#ifdef CONFIG_NUMA_BALANCING
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index c0e716aec26aa..b1949d8611238 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -92,6 +92,43 @@ enum mfill_atomic_mode {
NR_MFILL_ATOMIC_MODES,
};
+/* VMA userfaultfd operations */
+struct vm_uffd_ops {
+ /**
+ * @uffd_features: features supported in bitmask.
+ *
+ * When the ops is defined, the driver must set non-zero features
+ * to be a subset (or all) of: VM_UFFD_MISSING|WP|MINOR.
+ *
+ * NOTE: VM_UFFD_MISSING is still only supported under mm/ so far.
+ */
+ unsigned long uffd_features;

This variable name is a bit confusing, because it's all about VMA flags, not uffd features. Reading the name alone, I would rather connect it to things like UFFD_FEATURE_WP_UNPOPULATED.

As it is currently used for VM flags, maybe you should call this

unsigned long uffd_vm_flags;

or something like that.

I briefly wondered whether we could use the actual UFFD_FEATURE_* flags here, but they are rather unsuited for this case (e.g., there are different feature flags for hugetlb support, shmem support, etc.).

But reading "uffd_ioctls" below, can't we derive the suitable vma flags from the supported ioctls?

_UFFDIO_COPY | _UFFDIO_ZEROPAGE -> VM_UFFD_MISSING
_UFFDIO_WRITEPROTECT -> VM_UFFD_WP
_UFFDIO_CONTINUE -> VM_UFFD_MINOR
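
A minimal userspace sketch of that mapping, not kernel code. Every SKETCH_* name, the helper name, and all numeric values are hypothetical stand-ins chosen so the snippet compiles on its own; the real ioctl numbers and VM_UFFD_* bits come from the kernel headers. It also assumes that either _UFFDIO_COPY or _UFFDIO_ZEROPAGE alone is enough to imply VM_UFFD_MISSING, which the mapping above leaves open.

```c
/* Stand-in ioctl numbers (assumption; see include/uapi/linux/userfaultfd.h). */
#define SKETCH_UFFDIO_COPY          0x03
#define SKETCH_UFFDIO_ZEROPAGE      0x04
#define SKETCH_UFFDIO_WRITEPROTECT  0x06
#define SKETCH_UFFDIO_CONTINUE      0x07

/* Placeholder VMA flag bits; the real VM_UFFD_* values differ. */
#define SKETCH_VM_UFFD_MISSING  (1UL << 0)
#define SKETCH_VM_UFFD_WP       (1UL << 1)
#define SKETCH_VM_UFFD_MINOR    (1UL << 2)

/* uffd_ioctls is a bitmask indexed by ioctl number. */
#define SKETCH_IOCTL_BIT(nr)    (1UL << (nr))

/* Derive the VMA flags a driver supports from its supported ioctls. */
static unsigned long sketch_uffd_vm_flags_from_ioctls(unsigned long ioctls)
{
	unsigned long vm_flags = 0;

	/* Assumption: COPY or ZEROPAGE alone implies MISSING. */
	if (ioctls & (SKETCH_IOCTL_BIT(SKETCH_UFFDIO_COPY) |
		      SKETCH_IOCTL_BIT(SKETCH_UFFDIO_ZEROPAGE)))
		vm_flags |= SKETCH_VM_UFFD_MISSING;
	if (ioctls & SKETCH_IOCTL_BIT(SKETCH_UFFDIO_WRITEPROTECT))
		vm_flags |= SKETCH_VM_UFFD_WP;
	if (ioctls & SKETCH_IOCTL_BIT(SKETCH_UFFDIO_CONTINUE))
		vm_flags |= SKETCH_VM_UFFD_MINOR;

	return vm_flags;
}
```

With something along these lines, a driver would only declare uffd_ioctls, and the separate flags field could go away.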

+ /**
+ * @uffd_ioctls: ioctls supported in bitmask.
+ *
+ * Userfaultfd ioctls supported by the module. Below will always
+ * be supported by default whenever a module provides vm_uffd_ops:
+ *
+ * _UFFDIO_API, _UFFDIO_REGISTER, _UFFDIO_UNREGISTER, _UFFDIO_WAKE
+ *
+ * The module needs to provide all the rest optionally supported
+ * ioctls. For example, when VM_UFFD_MINOR is supported,
+ * _UFFDIO_CONTINUE must be supported as an ioctl.
+ */
+ unsigned long uffd_ioctls;
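
The two constraints spelled out in the comments so far (a non-zero subset of MISSING|WP|MINOR, and MINOR requiring _UFFDIO_CONTINUE) could be checked along the following lines. This is a userspace sketch under stated assumptions, not code from this series: the SKETCH_* names, the helper name, and all numeric values are hypothetical placeholders, and only the rules explicitly stated in the comments are encoded.

```c
#include <stdbool.h>

/* Stand-in ioctl number (assumption; see include/uapi/linux/userfaultfd.h). */
#define SKETCH_UFFDIO_CONTINUE  0x07
#define SKETCH_IOCTL_BIT(nr)    (1UL << (nr))

/* Placeholder VMA flag bits; the real VM_UFFD_* values differ. */
#define SKETCH_VM_UFFD_MISSING  (1UL << 0)
#define SKETCH_VM_UFFD_WP       (1UL << 1)
#define SKETCH_VM_UFFD_MINOR    (1UL << 2)
#define SKETCH_VM_UFFD_ALL \
	(SKETCH_VM_UFFD_MISSING | SKETCH_VM_UFFD_WP | SKETCH_VM_UFFD_MINOR)

/* Check a driver's declared flags against its declared ioctls. */
static bool sketch_uffd_ops_valid(unsigned long vm_flags, unsigned long ioctls)
{
	/* Must be a non-zero subset of MISSING|WP|MINOR. */
	if (!vm_flags || (vm_flags & ~SKETCH_VM_UFFD_ALL))
		return false;

	/* MINOR support requires the CONTINUE ioctl. */
	if ((vm_flags & SKETCH_VM_UFFD_MINOR) &&
	    !(ioctls & SKETCH_IOCTL_BIT(SKETCH_UFFDIO_CONTINUE)))
		return false;

	return true;
}
```
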
+ /**
+ * uffd_get_folio: Handler to resolve UFFDIO_CONTINUE request.

Just wondering if we could incorporate the "continue" / "minor" aspect into the callback name.

uffd_minor_get_folio / uffd_continue_get_folio

Or do you see use of that callback in the context of other uffd features?

--
Cheers

David / dhildenb