[PATCH v9 0/6] KVM: allow mapping non-refcounted pages

From: David Stevens
Date: Sun Sep 10 2023 - 22:17:03 EST


From: David Stevens <stevensd@xxxxxxxxxxxx>

This patch series adds support for mapping VM_IO and VM_PFNMAP memory
that is backed by struct pages that aren't currently being refcounted
(e.g. tail pages of non-compound higher order allocations) into the
guest.

Our use case is virtio-gpu blob resources [1], which directly map host
graphics buffers into the guest as "vram" for the virtio-gpu device.
This feature currently does not work on systems using the amdgpu driver,
as that driver allocates non-compound higher order pages via
ttm_pool_alloc_page.

First, this series replaces the __gfn_to_pfn_memslot API with a more
extensible __kvm_faultin_pfn API. The updated API moves
__gfn_to_pfn_memslot's args into a struct and, where possible, packs the
bool arguments into a FOLL_ flags argument. The refactoring does not
change any behavior.
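
As a rough userspace illustration of that kind of repacking (not the code
from this series), the sketch below turns a bool-heavy argument list into
a request struct plus a flags word. Every name here (demo_follow_pfn,
demo_pack, the DEMO_FOLL_* bits and their values) is hypothetical and
chosen only for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits in the spirit of FOLL_*; the values are made up. */
#define DEMO_FOLL_WRITE         (1u << 0)
#define DEMO_FOLL_NOWAIT        (1u << 1)
#define DEMO_FOLL_INTERRUPTIBLE (1u << 2)

/* Hypothetical request struct: inputs and outputs travel together. */
struct demo_follow_pfn {
    uint64_t gfn;        /* input: guest frame number */
    unsigned int flags;  /* input: DEMO_FOLL_* bits */
    bool atomic;         /* input: left as a bool, no flag models it here */
    bool writable;       /* output: would be filled in by the lookup */
};

/* Translate a legacy bool-heavy argument list into the struct form. */
static struct demo_follow_pfn demo_pack(uint64_t gfn, bool atomic,
                                        bool interruptible, bool no_wait,
                                        bool write_fault)
{
    struct demo_follow_pfn req = { .gfn = gfn, .atomic = atomic };

    if (write_fault)
        req.flags |= DEMO_FOLL_WRITE;
    if (no_wait)
        req.flags |= DEMO_FOLL_NOWAIT;
    if (interruptible)
        req.flags |= DEMO_FOLL_INTERRUPTIBLE;
    return req;
}
```

With this shape, a later change can add a new input or output as a struct
field without touching every call site's argument list.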

From there, this series extends the __kvm_faultin_pfn API so that
non-refcounted pages can be safely handled. This involves adding an
input parameter to indicate whether the caller can safely use
non-refcounted pfns and an output parameter to tell the caller whether
or not the returned page is refcounted. This is a breaking change:
non-refcounted pfn mappings are disallowed by default, as such mappings
are unsafe. To let systems that rely on such mappings continue to
function, an opt-in module parameter is added to allow the unsafe
behavior.
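
The input/output pair and the opt-in can be modeled in a few lines of
userspace C. This is only a sketch of one plausible policy, not the
logic in the patches: demo_page, demo_follow, demo_allow_unsafe and both
functions are hypothetical names, and the rule that either the caller's
flag or the global opt-in permits a non-refcounted mapping is an
assumption made for the example.

```c
#include <stdbool.h>

/* Toy stand-in for a host page; not the kernel's struct page. */
struct demo_page {
    bool has_refcount;  /* false models e.g. a tail page of a
                         * non-compound higher order allocation */
    int refcount;
};

/* Hypothetical request: one input and one output, as described above. */
struct demo_follow {
    bool caller_handles_non_refcounted;  /* input */
    bool is_refcounted;                  /* output */
};

/* Models the opt-in module parameter; off by default. */
static bool demo_allow_unsafe = false;

/* Returns 0 on success, -1 if the mapping is refused. */
static int demo_follow_page(struct demo_follow *req, struct demo_page *page)
{
    req->is_refcounted = false;

    if (page->has_refcount) {
        page->refcount++;           /* take a reference for the caller */
        req->is_refcounted = true;
        return 0;
    }

    /* Assumed policy: refuse unless the caller or the admin opted in. */
    if (req->caller_handles_non_refcounted || demo_allow_unsafe)
        return 0;
    return -1;
}

/* Drop the reference only if the lookup actually took one. */
static void demo_release_page(const struct demo_follow *req,
                              struct demo_page *page)
{
    if (req->is_refcounted)
        page->refcount--;
}
```

The point of the output field is visible in demo_release_page: a caller
that ignores it would decrement a count it never incremented.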

This series only adds support for non-refcounted pages to x86. Other
MMUs can likely be updated without too much difficulty, but it is not
needed at this point. Updating other parts of KVM (e.g. pfncache) is not
straightforward [2].

[1] https://patchwork.kernel.org/project/dri-devel/cover/20200814024000.2485-1-gurchetansingh@xxxxxxxxxxxx/
[2] https://lore.kernel.org/all/ZBEEQtmtNPaEqU1i@xxxxxxxxxx/

v8 -> v9:
- Make paying attention to is_refcounted_page mandatory. This means
  that FOLL_GET is no longer necessary. For compatibility with
  un-migrated callers, add a temporary parameter to sidestep
  ref-counting issues.
- Add allow_unsafe_mappings, which is a breaking change.
- Migrate kvm_vcpu_map and other callsites used by x86 to the new API.
- Drop arm and ppc changes.
v7 -> v8:
- Set access bits before releasing mmu_lock.
- Pass FOLL_GET on 32-bit x86 or !tdp_enabled.
- Refactor FOLL_GET handling, add kvm_follow_refcounted_pfn helper.
- Set refcounted bit on >4k pages.
- Add comments and apply formatting suggestions.
- Rebase on kvm next branch.
v6 -> v7:
- Replace __gfn_to_pfn_memslot with a more flexible __kvm_faultin_pfn,
  and extend that API to support non-refcounted pages (complete
  rewrite).

David Stevens (5):
  KVM: mmu: Introduce __kvm_follow_pfn function
  KVM: mmu: Improve handling of non-refcounted pfns
  KVM: Migrate kvm_vcpu_map to __kvm_follow_pfn
  KVM: x86: Migrate to __kvm_follow_pfn
  KVM: x86/mmu: Handle non-refcounted pages

Sean Christopherson (1):
  KVM: Assert that a page's refcount is elevated when marking
    accessed/dirty

 arch/x86/kvm/mmu/mmu.c          |  93 +++++++---
 arch/x86/kvm/mmu/mmu_internal.h |   1 +
 arch/x86/kvm/mmu/paging_tmpl.h  |   8 +-
 arch/x86/kvm/mmu/spte.c         |   4 +-
 arch/x86/kvm/mmu/spte.h         |  12 +-
 arch/x86/kvm/mmu/tdp_mmu.c      |  22 ++-
 arch/x86/kvm/x86.c              |  12 +-
 include/linux/kvm_host.h        |  42 ++++-
 virt/kvm/kvm_main.c             | 294 +++++++++++++++++++-------------
 virt/kvm/kvm_mm.h               |   3 +-
 virt/kvm/pfncache.c             |  11 +-
 11 files changed, 339 insertions(+), 163 deletions(-)

--
2.42.0.283.g2d96d420d3-goog