Re: [PATCH v7 0/2] KVM: guest_memfd: use write for population

From: Nikita Kalyazin

Date: Fri Nov 14 2025 - 10:23:48 EST




On 14/11/2025 15:18, Kalyazin, Nikita wrote:
On systems that support shared guest memory, write() is useful, for
example, for population of the initial image. Even though the same can
also be achieved via userspace mapping and memcpying from userspace,
write() provides a more performant option because it does not need to
set up user page tables and it does not cause a page fault for every
page like memcpy would. Note that memcpy cannot be accelerated via
MADV_POPULATE_WRITE as it is not supported by guest_memfd and relies on
GUP.
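
For illustration only, a minimal userspace sketch of the write()-based
population loop. It is not taken from the patches: populate_via_write()
and populate_selfcheck() are hypothetical names, and the self-check runs
against an ordinary temporary file as a stand-in. Obtaining a real
guest_memfd (KVM_CREATE_GUEST_MEMFD with write support enabled) is not
shown, and whether short writes can occur on a guest_memfd is an
assumption the loop merely tolerates.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write len bytes from buf to fd at its current file position, retrying
 * on short writes and EINTR. Returns 0 on success, -1 with errno set.
 * Nothing here is guest_memfd-specific; it works on any writable fd.
 */
int populate_via_write(int fd, const void *buf, size_t len)
{
	const unsigned char *p = buf;

	assert(buf != NULL || len == 0);

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (n == 0) {
			/* No progress: report an error instead of spinning. */
			errno = EIO;
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}
	return 0;
}

/*
 * Stand-in check on an ordinary temporary file (NOT a guest_memfd):
 * write two 4 KiB blocks and read them back. Returns 0 on success.
 */
int populate_selfcheck(void)
{
	char path[] = "/tmp/populate_sketch_XXXXXX";
	unsigned char src[8192], dst[8192];
	int fd, ret = -1;

	for (size_t i = 0; i < sizeof(src); i++)
		src[i] = (unsigned char)(i * 31u);

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);

	if (populate_via_write(fd, src, sizeof(src)) == 0 &&
	    pread(fd, dst, sizeof(dst), 0) == (ssize_t)sizeof(dst) &&
	    memcmp(src, dst, sizeof(src)) == 0)
		ret = 0;

	close(fd);
	return ret;
}
```

With a real guest_memfd, the image buffer and length passed to
populate_via_write() would presumably have to be page-sized multiples,
per the alignment restriction described in this cover letter.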

Populating 512MiB of guest_memfd on an x86 machine:
- via memcpy: 436 ms
- via write: 202 ms (-54%)
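
As a quick check of the arithmetic behind these figures, the sketch
below recomputes the relative change and the implied throughput from
the two timings. The helper names are hypothetical, and the throughput
values are derived here, not measured separately.

```c
#include <assert.h>

/* Relative change of new_ms versus baseline_ms, in percent. */
double relative_change_percent(double baseline_ms, double new_ms)
{
	assert(baseline_ms > 0.0);
	return (new_ms - baseline_ms) / baseline_ms * 100.0;
}

/* Throughput in MiB/s for copying size_mib in elapsed_ms. */
double throughput_mib_per_s(double size_mib, double elapsed_ms)
{
	assert(elapsed_ms > 0.0);
	return size_mib / (elapsed_ms / 1000.0);
}
```

For 436 ms and 202 ms this gives roughly -53.7%, which rounds to the
-54% quoted, and about 1174 MiB/s for memcpy versus about 2535 MiB/s
for write.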

Only PAGE_ALIGNED offset and len are allowed. Even though non-aligned
writes are technically possible, when in-place conversion support is
implemented [1], the restriction makes handling of mixed shared/private
huge pages simpler. write() will only be allowed to populate shared
pages.
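
A caller could pre-validate a request against this restriction before
issuing write(). The sketch below is a userspace approximation under
stated assumptions: write_range_is_page_aligned() is a hypothetical
helper, not the kernel's check, and the page size is passed in by the
caller (for example from sysconf(_SC_PAGESIZE)).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if both offset and len are multiples of page_size.
 * page_size must be a non-zero power of two. A zero len counts as
 * aligned here; how the kernel treats it is not covered.
 */
bool write_range_is_page_aligned(uint64_t offset, uint64_t len,
				 uint64_t page_size)
{
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

	/* Any low bit set in either value means a misaligned request. */
	return ((offset | len) & (page_size - 1)) == 0;
}
```

With a 4096-byte page, offset 8192 and len 8192 pass, while offset 1
or len 100 do not.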

When direct map removal is implemented [2]:
- write() will not be allowed to access pages that have already
  been removed from the direct map
- on completion, write() will remove the populated pages from
  the direct map

While it is technically possible to implement read() syscall on systems
with shared guest memory, it is not supported as there is currently no
use case for it.

[1]
https://lore.kernel.org/kvm/cover.1760731772.git.ackerleytng@xxxxxxxxxx
[2]
https://lore.kernel.org/kvm/20250924151101.2225820-1-patrick.roy@xxxxxxxxxxxxx

I failed to include links to previous versions:

v7:
- Sean: add GUEST_MEMFD_FLAG_WRITE and documentation for it
- Ackerley: only allow PAGE_ALIGNED offset and len
- Sean/Ackerley: formatting fixes

v6:
- https://lore.kernel.org/kvm/20251020161352.69257-1-kalyazin@xxxxxxxxxx
- Make write support conditional on mmap support instead of relying on
  the up-to-date flag to decide whether writing to a page is allowed
- James: Remove dependencies on folio_test_large
- James: Remove page alignment restriction
- James: Formatting fixes

v5:
- https://lore.kernel.org/kvm/20250902111951.58315-1-kalyazin@xxxxxxxxxx
- Replace the call to the unexported filemap_remove_folio with
  zeroing the bytes that could not be copied
- Fix checkpatch findings

v4:
- https://lore.kernel.org/kvm/20250828153049.3922-1-kalyazin@xxxxxxxxxx
- Switch from implementing the write callback to write_iter
- Remove conditional compilation

v3:
- https://lore.kernel.org/kvm/20250303130838.28812-1-kalyazin@xxxxxxxxxx
- David/Mike D: Only compile support for the write syscall if
  CONFIG_KVM_GMEM_SHARED_MEM (now gone) is enabled.

v2:
- https://lore.kernel.org/kvm/20241129123929.64790-1-kalyazin@xxxxxxxxxx
- Switch from an ioctl to the write syscall to implement population

v1:
- https://lore.kernel.org/kvm/20241024095429.54052-1-kalyazin@xxxxxxxxxx


Nikita Kalyazin (2):
  KVM: guest_memfd: add generic population via write
  KVM: selftests: update guest_memfd write tests

 Documentation/virt/kvm/api.rst               |  2 +
 include/linux/kvm_host.h                     |  2 +-
 include/uapi/linux/kvm.h                     |  1 +
 .../testing/selftests/kvm/guest_memfd_test.c | 58 +++++++++++++++++--
 virt/kvm/guest_memfd.c                       | 52 +++++++++++++++++
 5 files changed, 108 insertions(+), 7 deletions(-)


base-commit: 8a4821412cf2c1429fffa07c012dd150f2edf78c
--
2.50.1