Re: [RFC PATCH 0/5] KVM: guest_memfd: support for uffd missing
From: Peter Xu
Date: Thu Mar 13 2025 - 18:38:24 EST
On Thu, Mar 13, 2025 at 10:13:23PM +0000, Nikita Kalyazin wrote:
> Yes, that's right, mmap() + memcpy() is functionally sufficient. write() is
> an optimisation. Most of the pages in guest_memfd are only ever accessed by
> the vCPU (not userspace) via TDP (stage-2 pagetables) so they don't need
> userspace pagetables set up. By using write() we can avoid VMA faults,
> installing corresponding PTEs and double page initialisation we discussed
> earlier. The optimised path only contains pagecache population via write().
> Even TDP faults can be avoided if using KVM prefaulting API [1].
>
> [1] https://docs.kernel.org/virt/kvm/api.html#kvm-pre-fault-memory
Could you elaborate on why VMA faults matter for perf?
If we're talking about postcopy-like migrations on top of KVM guest-memfd,
IIUC the VMAs can be pre-faulted too just like the TDP pgtables, e.g. with
MADV_POPULATE_WRITE.
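
As a rough sketch of that pre-faulting idea (not code from this series): the
helper below maps a shared fd and populates the userspace page tables up
front with MADV_POPULATE_WRITE, which needs Linux 5.14 or later. The names
map_and_prefault() and prefault_demo() are hypothetical, and memfd_create()
is only a stand-in here, on the assumption that a guest_memfd fd can be
mmap()ed the same way.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* MADV_POPULATE_WRITE needs Linux 5.14+; older libc headers may lack it. */
#ifndef MADV_POPULATE_WRITE
#define MADV_POPULATE_WRITE 23
#endif

/*
 * Hypothetical helper: map `len` bytes of `fd` shared, then pre-fault the
 * userspace page tables writable so later accesses take no VMA faults.
 * Returns the mapping, or MAP_FAILED with errno set.
 */
static void *map_and_prefault(int fd, size_t len)
{
    assert(len > 0);

    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED)
        return MAP_FAILED;

    if (madvise(addr, len, MADV_POPULATE_WRITE) != 0) {
        int saved = errno;      /* e.g. EINVAL on kernels before 5.14 */
        munmap(addr, len);
        errno = saved;
        return MAP_FAILED;
    }
    return addr;
}

/* Usage sketch: memfd_create() stands in for a guest_memfd fd (assumption). */
static int prefault_demo(void)
{
    const size_t len = 16 * 4096;
    int fd = memfd_create("gmem-stand-in", 0);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, len) != 0) {
        close(fd);
        return -1;
    }

    char *p = map_and_prefault(fd, len);
    if (p == MAP_FAILED) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }

    p[0] = 1;                   /* PTE already installed by the populate call */
    munmap(p, len);
    close(fd);
    return 0;
}
```

The populate call still pays the page allocation and PTE installation cost;
it only moves that cost out of the access path into one batched call.
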
Normally, AFAIU, userspace apps optimize IO the other way round: they change
write()s into mmap()s, which at least avoids one round of copy.
For postcopy using minor traps (and since guest-memfd is always shared and
non-private..), it's also possible to feed the mmap()ed VAs to the NIC as
buffers (e.g. as part of iovec[] in recvmsg()), and as long as the mmap()ed
ranges are not registered by KVM memslots, there's no concern about
non-atomic copy.
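
To illustrate the recvmsg() idea, here is a minimal sketch (again not from
the series) that receives one page straight into an mmap()ed range through
iovec[], with no intermediate buffer. The names recv_page_into_mapping() and
recv_demo() are hypothetical, the 4096-byte page size is assumed, and
memfd_create() plus a socketpair() stand in for guest_memfd and the
migration stream. It also assumes the target range is not yet visible to the
guest, as described above.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define PAGE_SZ 4096            /* assumed page size for this sketch */

/*
 * Hypothetical helper: receive exactly one page from `sock` directly into
 * page `page_idx` of the mapping at `base`. Returns the recvmsg() result.
 */
static ssize_t recv_page_into_mapping(int sock, char *base, size_t page_idx)
{
    assert(base != NULL);

    struct iovec iov = {
        .iov_base = base + page_idx * PAGE_SZ,
        .iov_len  = PAGE_SZ,
    };
    struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };

    /* MSG_WAITALL: block until the whole page has arrived. */
    return recvmsg(sock, &msg, MSG_WAITALL);
}

/* Usage sketch with stand-ins for guest_memfd and the migration socket. */
static int recv_demo(void)
{
    const size_t npages = 4, idx = 2;
    char page[PAGE_SZ];
    int sv[2], fd, rc = -1;
    char *base = MAP_FAILED;

    memset(page, 0xab, sizeof(page));
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    fd = memfd_create("gmem-stand-in", 0);
    if (fd >= 0 && ftruncate(fd, npages * PAGE_SZ) == 0)
        base = mmap(NULL, npages * PAGE_SZ, PROT_READ | PROT_WRITE,
                    MAP_SHARED, fd, 0);

    /* Send one page on one end, land it in the mapping on the other. */
    if (base != MAP_FAILED &&
        write(sv[0], page, PAGE_SZ) == PAGE_SZ &&
        recv_page_into_mapping(sv[1], base, idx) == PAGE_SZ &&
        memcmp(base + idx * PAGE_SZ, page, PAGE_SZ) == 0)
        rc = 0;

    if (base != MAP_FAILED)
        munmap(base, npages * PAGE_SZ);
    if (fd >= 0)
        close(fd);
    close(sv[0]);
    close(sv[1]);
    return rc;
}
```

The kernel copies from the socket buffer into the page cache page behind the
mapping, so the only copy left is the one the receive path does anyway.
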
Thanks,
--
Peter Xu