[RFC PATCH 00/14] Support multiple KVM modules on the same host

From: Anish Ghulati
Date: Tue Nov 07 2023 - 15:20:33 EST


This series is a rough, PoC-quality RFC to allow (un)loading and running
multiple KVM modules simultaneously on a single host, e.g. to deploy
fixes, mitigations, and/or new features without having to drain all VMs
from the host. Multi-KVM will also allow running the "same" KVM module
with different params, e.g. to run trusted VMs with different mitigations.

The goal of this RFC is to get feedback on the idea itself and the
high-level approach. In particular, we're looking for input on:

- Combining kvm_intel.ko and kvm_amd.ko into kvm.ko
- Exposing multiple /dev/kvmX devices via Kconfig
- The name and prefix of the new base module

Feedback on individual patches is also welcome, but please keep in mind
that this is very much a work in progress.

This builds on Sean's series to hide KVM internals:

https://lore.kernel.org/lkml/20230916003118.2540661-1-seanjc@xxxxxxxxxx

The whole thing can be found at:

https://github.com/asg-17/linux vac-rfc

The basic gist of the approach is to:

- Move system-wide virtualization resource management to a new base
module to avoid collisions between different KVM modules, e.g. VPIDs
and ASIDs need to be unique per VM, and callbacks from IRQ handlers need
to be mediated so that things like PMIs get to the right KVM instance.

- Refactor KVM to make all upgradable assets visible only to KVM, i.e.
make KVM a black box, so that the layout/size of things like "struct
kvm_vcpu" isn't exposed to the kernel at-large.

- Fold kvm_intel.ko and kvm_amd.ko into kvm.ko to avoid the complications
of having to generate unique symbols for every symbol exported by kvm.ko.

- Add a Kconfig string to allow defining a device and module postfix at
build time, e.g. to create kvmX.ko and /dev/kvmX.
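
To illustrate the first point above, here is a minimal user-space sketch
of a single ID pool owned by a base module, so that IDs handed out to
different KVM instances can never collide. This is not code from the
series: the vac_id_pool type, the vac_*_id() helpers, and the 64-entry
pool size are all hypothetical, and locking is omitted for brevity.
Reserving ID 0 is an assumption modeled on how VMX treats VPID 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pool size; the real VPID space is 16 bits wide. */
#define VAC_NR_IDS 64

/* One pool shared by every KVM instance (hypothetical type). */
struct vac_id_pool {
	uint64_t used;	/* bit n set => ID n is allocated */
};

static void vac_id_pool_init(struct vac_id_pool *pool)
{
	/* Assumption: ID 0 is reserved and never handed out. */
	pool->used = 1ULL;
}

/* Return the lowest free ID, or 0 if the pool is exhausted. */
static unsigned int vac_alloc_id(struct vac_id_pool *pool)
{
	for (unsigned int id = 1; id < VAC_NR_IDS; id++) {
		if (!(pool->used & (1ULL << id))) {
			pool->used |= 1ULL << id;
			return id;
		}
	}
	return 0;
}

/* Release an ID; ID 0 and out-of-range IDs are ignored. */
static void vac_free_id(struct vac_id_pool *pool, unsigned int id)
{
	if (id > 0 && id < VAC_NR_IDS)
		pool->used &= ~(1ULL << id);
}
```

Because every instance allocates from the same pool, two KVM modules
loaded side by side would draw from one namespace; a real
implementation would also need to serialize access.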

The proposed name of the new base module is vac.ko, a.k.a.
Virtualization Acceleration Code (Unupgradable Units Module). Childish
humor aside, "vac" is a unique name in the kernel and hopefully in x86
and hardware terminology, e.g. `git grep vac_` yields no hits in the
kernel. It also has the same number of characters as "kvm", i.e. the
namespace can be modified without needing whitespace adjustment if we
want to go that route.

Requirements / Goals / Notes:
- Fully opt-in and backwards compatible (except for the disappearance
of kvm_{amd,intel}.ko).

- User space ultimately controls and is responsible for deployment,
usage, lifecycles, etc. Standard module refcounting applies, but
ensuring that a VM is created with the "right" KVM module is a user
space problem.

- No user space *VMM* changes are required, e.g. /dev/kvm can be
presented to a VMM by symlinking /dev/kvmX.

- Mutually exclusive with subsystems that have a hard dependency on KVM,
i.e. KVMGT.

- x86 only (for the foreseeable future).

Anish Ghulati (13):
KVM: x86: Move common module params from SVM/VMX to x86
KVM: x86: Fold x86 vendor modules into the main KVM modules
KVM: x86: Remove unused exports
KVM: x86: Create stubs for a new VAC module
KVM: x86: Refactor hardware enable/disable operations into a new file
KVM: x86: Move user return msr operations out of KVM
KVM: SVM: Move shared SVM data structures into VAC
KVM: VMX: Move shared VMX data structures into VAC
KVM: VMX: Move VMX enable and disable into VAC
KVM: SVM: Move SVM enable and disable into VAC
KVM: x86: Move VMX and SVM support checks into VAC
KVM: x86: VAC: Move all hardware enable/disable code into VAC
KVM: VAC: Bring up VAC as a new module

Venkatesh Srinivas (1):
KVM: x86: Move shared KVM state into VAC

arch/x86/include/asm/kvm-x86-ops.h | 3 +-
arch/x86/include/asm/kvm_host.h | 12 +-
arch/x86/kernel/nmi.c | 2 +-
arch/x86/kvm/Kconfig | 29 +-
arch/x86/kvm/Makefile | 31 ++-
arch/x86/kvm/cpuid.c | 8 +-
arch/x86/kvm/hyperv.c | 2 -
arch/x86/kvm/irq.c | 3 -
arch/x86/kvm/irq_comm.c | 2 -
arch/x86/kvm/kvm_onhyperv.c | 3 -
arch/x86/kvm/lapic.c | 15 -
arch/x86/kvm/mmu/mmu.c | 12 -
arch/x86/kvm/mmu/spte.c | 4 -
arch/x86/kvm/mtrr.c | 1 -
arch/x86/kvm/pmu.c | 2 -
arch/x86/kvm/svm/nested.c | 4 +-
arch/x86/kvm/svm/sev.c | 2 +-
arch/x86/kvm/svm/svm.c | 224 ++-------------
arch/x86/kvm/svm/svm.h | 21 +-
arch/x86/kvm/svm/svm_data.h | 23 ++
arch/x86/kvm/svm/svm_ops.h | 1 +
arch/x86/kvm/svm/vac.c | 172 ++++++++++++
arch/x86/kvm/svm/vac.h | 20 ++
arch/x86/kvm/vac.c | 214 +++++++++++++++
arch/x86/kvm/vac.h | 69 +++++
arch/x86/kvm/vmx/nested.c | 6 +-
arch/x86/kvm/vmx/vac.c | 287 +++++++++++++++++++
arch/x86/kvm/vmx/vac.h | 20 ++
arch/x86/kvm/vmx/vmx.c | 332 +++-------------------
arch/x86/kvm/vmx/vmx.h | 2 -
arch/x86/kvm/vmx/vmx_ops.h | 1 +
arch/x86/kvm/x86.c | 423 ++---------------------------
arch/x86/kvm/x86.h | 15 +-
include/linux/kvm_host.h | 2 +
virt/kvm/Makefile.kvm | 14 +-
virt/kvm/kvm_main.c | 210 +-------------
virt/kvm/vac.c | 192 +++++++++++++
virt/kvm/vac.h | 40 +++
38 files changed, 1212 insertions(+), 1211 deletions(-)
create mode 100644 arch/x86/kvm/svm/svm_data.h
create mode 100644 arch/x86/kvm/svm/vac.c
create mode 100644 arch/x86/kvm/svm/vac.h
create mode 100644 arch/x86/kvm/vac.c
create mode 100644 arch/x86/kvm/vac.h
create mode 100644 arch/x86/kvm/vmx/vac.c
create mode 100644 arch/x86/kvm/vmx/vac.h
create mode 100644 virt/kvm/vac.c
create mode 100644 virt/kvm/vac.h


base-commit: 0b78fc46e5450f08ef92431e569c797a63f31517
--
2.42.0.869.gea05f2083d-goog