Re: [RFC PATCH 0/4] KVM: ioctl for populating guest_memfd

From: Paolo Bonzini
Date: Wed Nov 20 2024 - 08:56:16 EST


On 10/24/24 11:54, Nikita Kalyazin wrote:
Firecracker currently allows guest memory to be populated from a separate
process via UserfaultFD [1]. This helps keep the VMM codebase and
functionality concise and generic, while offloading the logic of
obtaining guest memory to another process. UserfaultFD is currently not
supported for guest_memfd, because it binds to a VMA, while guest_memfd
does not necessarily need to be (or cannot be) mapped to userspace,
especially for private memory. [2] proposes an alternative to
UserfaultFD for intercepting stage-2 faults, while this series
conceptually complements it with the ability to populate guest memory
backed by guest_memfd for `KVM_X86_SW_PROTECTED_VM` VMs.

Patches 1-3 add a new ioctl, `KVM_GUEST_MEMFD_POPULATE`, that uses a
vendor-agnostic implementation of the `post_populate` callback.

Patch 4 allows the ioctl to be called from a separate (non-VMM) process.
This has been prohibited by [3], but I have not been able to locate the
exact justification for the requirement.

The justification is that the "struct kvm" has a long-lived tie to a host process's address space.

Invoking ioctls like KVM_SET_USER_MEMORY_REGION and KVM_RUN from different processes would make things very messy, because it is not clear which mm you are working with. The MMU notifier is registered for kvm->mm, but some functions, such as get_user_pages, do not take an mm and always operate on current->mm.
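
A minimal userspace model of the ownership rule described above may help make it concrete. It is not kernel code: the `model_mm`, `model_vm` and `model_vm_ioctl` names are hypothetical stand-ins, and the `-EIO` return value is an assumption of this sketch. The only point illustrated is that a VM remembers the address space that created it and refuses ioctls issued from any other one.

```c
#include <errno.h>

/* Hypothetical stand-in for an address space (the kernel's mm). */
struct model_mm {
	int id;
};

/* Hypothetical stand-in for a VM that records the mm that created it. */
struct model_vm {
	const struct model_mm *mm;
};

/*
 * Model of a VM ioctl entry point: the caller's address space must be the
 * one the VM was created in, otherwise the call is rejected up front.
 * The error code used here is an assumption of this sketch.
 */
static int model_vm_ioctl(const struct model_vm *vm,
			  const struct model_mm *current_mm)
{
	if (vm->mm != current_mm)
		return -EIO;

	/* A real handler would dispatch on the ioctl number here. */
	return 0;
}
```

With a VM created by the VMM's address space, a call made with that same `model_mm` returns 0, while a call made with a helper process's `model_mm` is rejected. Relaxing the check for a single ioctl means every function reachable from it must take the mm explicitly rather than rely on the caller's.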

In your case, it should be enough to add an ioctl on the guest_memfd instead? But the real question is, what are you using KVM_X86_SW_PROTECTED_VM for?

Paolo

Questions:
- Does exposing a generic population interface via ioctl look
sensible in this form?
- Is there a path where the "only VMM can call KVM API" requirement is
relaxed? If not, what is the recommended efficient alternative for
populating guest memory from outside the VMM?

[1]: https://github.com/firecracker-microvm/firecracker/blob/main/docs/snapshotting/handling-page-faults-on-snapshot-resume.md
[2]: https://lore.kernel.org/kvm/CADrL8HUHRMwUPhr7jLLBgD9YLFAnVHc=N-C=8er-x6GUtV97pQ@xxxxxxxxxxxxxx/T/
[3]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6d4e4c4fca5be806b888d606894d914847e82d78

Nikita

Nikita Kalyazin (4):
KVM: guest_memfd: add generic post_populate callback
KVM: add KVM_GUEST_MEMFD_POPULATE ioctl for guest_memfd
KVM: allow KVM_GUEST_MEMFD_POPULATE in another mm
KVM: document KVM_GUEST_MEMFD_POPULATE ioctl

Documentation/virt/kvm/api.rst | 23 +++++++++++++++++++++++
include/linux/kvm_host.h | 3 +++
include/uapi/linux/kvm.h | 9 +++++++++
virt/kvm/guest_memfd.c | 28 ++++++++++++++++++++++++++++
virt/kvm/kvm_main.c | 19 ++++++++++++++++++-
5 files changed, 81 insertions(+), 1 deletion(-)


base-commit: c8d430db8eec7d4fd13a6bea27b7086a54eda6da