Re: [PATCH RFC 00/17] mm, kvm: allow uffd support in guest_memfd
From: Peter Xu
Date: Tue Feb 03 2026 - 15:56:28 EST
On Tue, Jan 27, 2026 at 09:29:19PM +0200, Mike Rapoport wrote:
> From: "Mike Rapoport (Microsoft)" <rppt@xxxxxxxxxx>
>
> Hi,
>
> These patches enable support for userfaultfd in guest_memfd.
> They are quite different from the latest posting [1] so I'm restarting the
> versioning. As there was a lot of tension around the topic, this is an RFC
> to get some feedback and see how we can move forward.
>
> As the ground work I refactored userfaultfd handling of PTE-based memory types
> (anonymous and shmem) and converted them to use vm_uffd_ops for allocating a
> folio or getting an existing folio from the page cache. shmem also implements
> callbacks that add a folio to the page cache after the data passed in
> UFFDIO_COPY was copied and remove the folio from the page cache if page table
> update fails.
>
> In order for guest_memfd to notify userspace about page faults, there are new
> VM_FAULT_UFFD_MINOR and VM_FAULT_UFFD_MISSING that a ->fault() handler can
> return to inform the page fault handler that it needs to call
> handle_userfault() to complete the fault.
>
> Nikita helped to plumb these new goodies into guest_memfd and provided basic
> tests to verify that guest_memfd works with userfaultfd.
>
> I deliberately left hugetlb out, at least for the most part.
> hugetlb handles acquisition of VMA and more importantly establishing of parent
> page table entry differently than PTE-based memory types. This is a different
> abstraction level than what vm_uffd_ops provides and people objected to
> exposing such low level APIs as a part of VMA operations.
>
> Also, to enable uffd in guest_memfd refactoring of hugetlb is not needed and I
> prefer to delay it until the dust settles after the changes in this set.
>
> [1] https://lore.kernel.org/all/20251130111812.699259-1-rppt@xxxxxxxxxx
>
> Mike Rapoport (Microsoft) (12):
> userfaultfd: introduce mfill_copy_folio_locked() helper
> userfaultfd: introduce struct mfill_state
> userfaultfd: introduce mfill_get_pmd() helper.
> userfaultfd: introduce mfill_get_vma() and mfill_put_vma()
> userfaultfd: retry copying with locks dropped in mfill_atomic_pte_copy()
> userfaultfd: move vma_can_userfault out of line
> userfaultfd: introduce vm_uffd_ops
> userfaultfd, shmem: use a VMA callback to handle UFFDIO_CONTINUE
> userfaultfd: introduce vm_uffd_ops->alloc_folio()
> shmem, userfaultfd: implement shmem uffd operations using vm_uffd_ops
> userfaultfd: mfill_atomic() remove retry logic
> mm: introduce VM_FAULT_UFFD_MINOR fault reason
>
> Nikita Kalyazin (5):
> mm: introduce VM_FAULT_UFFD_MISSING fault reason
> KVM: guest_memfd: implement userfaultfd minor mode
> KVM: guest_memfd: implement userfaultfd missing mode
> KVM: selftests: test userfaultfd minor for guest_memfd
> KVM: selftests: test userfaultfd missing for guest_memfd
>
> include/linux/mm.h | 5 +
> include/linux/mm_types.h | 15 +-
> include/linux/shmem_fs.h | 14 -
> include/linux/userfaultfd_k.h | 74 +-
> mm/hugetlb.c | 21 +
> mm/memory.c | 8 +-
> mm/shmem.c | 188 +++--
> mm/userfaultfd.c | 671 ++++++++++--------
> .../testing/selftests/kvm/guest_memfd_test.c | 191 +++++
> virt/kvm/guest_memfd.c | 134 +++-
> 10 files changed, 871 insertions(+), 450 deletions(-)
Mike,
The idea looks good to me, thanks for this work! I like how you handle
UFFDIO_COPY for anon/shmem.
If you remember, I previously raised a concern about introducing two new
fault retvals only for userfaultfd:
https://lore.kernel.org/all/aShb8J18BaRrsA-u@x1.local/
IMHO they not only unnecessarily leak userfaultfd information into the
core fault definitions, but also cause code duplication. I still think we
should avoid them.
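To make the retval-based flow from the cover letter concrete, here is a toy
userspace model of it. This is only a sketch for discussion, not kernel
code: every toy_* name, the flag values, and the struct layout are made up
for illustration, and handle_userfault() is reduced to a stub that records
the reason. It shows a ->fault()-style handler returning a uffd-specific
code and the core fault path having to recognise that code and dispatch.

```c
/* Toy userspace model. Every name and value below is hypothetical. */
#include <assert.h>
#include <stdbool.h>

/* Made-up return bits standing in for the two proposed fault reasons. */
enum toy_fault {
	TOY_FAULT_DONE         = 0,
	TOY_FAULT_UFFD_MINOR   = 1 << 0,
	TOY_FAULT_UFFD_MISSING = 1 << 1,
};

struct toy_vma {
	bool uffd_minor;     /* VMA registered for minor faults */
	bool uffd_missing;   /* VMA registered for missing faults */
	bool folio_in_cache; /* backing folio already in the page cache */
};

/* Stub for handle_userfault(): only records what would be reported. */
int toy_handle_userfault(const struct toy_vma *vma, int reason, int *notified)
{
	(void)vma;
	*notified = reason;
	return TOY_FAULT_DONE;
}

/* Stand-in for a ->fault() handler that reports uffd cases via its retval. */
int toy_fault(const struct toy_vma *vma)
{
	if (vma->folio_in_cache)
		return vma->uffd_minor ? TOY_FAULT_UFFD_MINOR : TOY_FAULT_DONE;
	return vma->uffd_missing ? TOY_FAULT_UFFD_MISSING : TOY_FAULT_DONE;
}

/* Stand-in for the core fault path: it must know about the uffd bits. */
int toy_do_fault(const struct toy_vma *vma, int *notified)
{
	int ret = toy_fault(vma);

	*notified = 0;
	if (ret & (TOY_FAULT_UFFD_MINOR | TOY_FAULT_UFFD_MISSING))
		return toy_handle_userfault(vma, ret, notified);
	return ret;
}

/* Small self-check of the three paths through the model. */
void toy_selfcheck(void)
{
	struct toy_vma minor = { .uffd_minor = true, .folio_in_cache = true };
	struct toy_vma missing = { .uffd_missing = true };
	struct toy_vma plain = { .folio_in_cache = true };
	int notified;

	toy_do_fault(&minor, &notified);
	assert(notified == TOY_FAULT_UFFD_MINOR);
	toy_do_fault(&missing, &notified);
	assert(notified == TOY_FAULT_UFFD_MISSING);
	toy_do_fault(&plain, &notified);
	assert(notified == 0);
}
```

The point of the model is the check in toy_do_fault(): with dedicated
retvals, the generic path carries userfaultfd-specific knowledge, which is
the leak described above.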
This time, I've attached a smoke-tested patch removing both of them.
It's pretty small and it passes all old and new userfaultfd tests
(including the gmem ones). Feel free to have a look at the end of this mail.
I understand you want to avoid adding more complexity to this series. If
you want, I can also prepare such a patch to remove the two retvals after
this series lands. I'd still like to know what you think about it, though;
let me know if you have any comments.
Note that it may indeed need some perf tests to make sure there's zero
overhead after this change. Currently there's still some trivial overhead
(e.g. unnecessary folio locks), but IIUC we can avoid even that.
Thanks,
===8<===