[PATCH 00/18] Introducing Core Building Blocks for Hyper-V VSM Emulation
From: Nicolas Saenz Julienne
Date: Sun Jun 09 2024 - 11:50:43 EST
This series introduces core KVM functionality necessary to emulate Hyper-V's
Virtual Secure Mode in a Virtual Machine Monitor (VMM).
Hyper-V's Virtual Secure Mode (VSM) is a virtualization security feature that
leverages the hypervisor to create secure execution environments within a
guest. VSM is documented as part of Microsoft's Hypervisor Top Level Functional
Specification [1]. Security features that build upon VSM, like Windows
Credential Guard, are enabled by default on Windows 11 and are becoming a
prerequisite in some industries.
VSM introduces the concept of Virtual Trust Levels (VTLs). These are
independent execution contexts, each with its own CPU architectural state,
local APIC state, and a different view of memory. They are hierarchical, with
more privileged VTLs having priority over the execution of lower VTLs and
control over lower VTLs' state. Windows leverages these low-level
paravirtualized primitives, as well as the hypervisor's higher trust base, to
prevent guest data exfiltration even when the operating system itself has been
compromised.
As discussed at LPC2023 and in our previous RFC [2], we decided to model each
VTL as a distinct KVM VM. With this approach, and the RWX memory attributes
introduced in this series, we have been able to implement VTL memory
protections in a non-intrusive way, using generic KVM APIs. Additionally, each
CPU's VTL is modeled as a distinct KVM vCPU, owned by the KVM VM tracking that
VTL's state. VTL awareness is fully removed from KVM, and the responsibility
for VTL-aware hypercalls, VTL scheduling, and state transfer is delegated to
userspace.
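As a rough illustration of the model above (not code from this series or from the QEMU PoC), a VMM could keep one KVM VM/vCPU pair per VTL for each guest CPU and decide in userspace which VTL to run. All names below (demo_cpu, demo_vtl_vcpu, demo_pick_vtl, DEMO_NR_VTLS) are hypothetical, and the "highest runnable VTL wins" policy is a simplifying assumption derived from the priority rule described earlier:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption for this sketch: only VTL0 and VTL1 are modeled. */
#define DEMO_NR_VTLS 2

/* Hypothetical per-VTL bookkeeping a VMM might keep for one guest CPU. */
struct demo_vtl_vcpu {
	int vm_fd;     /* fd of the KVM VM tracking this VTL's state */
	int vcpu_fd;   /* fd of the KVM vCPU owned by that VM */
	bool runnable; /* false while this VTL is inactive on the CPU */
};

/* One guest CPU is backed by one KVM vCPU per VTL. */
struct demo_cpu {
	struct demo_vtl_vcpu vtl[DEMO_NR_VTLS];
};

/*
 * Pick the VTL to run next: the highest-numbered runnable VTL takes
 * priority over lower ones. Returns the VTL index, or -1 if no VTL is
 * runnable.
 */
int demo_pick_vtl(const struct demo_cpu *cpu)
{
	assert(cpu != NULL);

	for (int i = DEMO_NR_VTLS - 1; i >= 0; i--) {
		if (cpu->vtl[i].runnable)
			return i;
	}
	return -1;
}
```

In a real VMM the selected entry's vcpu_fd would then be passed to KVM_RUN, and the VTL-aware hypercall exits would update the runnable flags. That part is omitted here.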
Series overview:
- 1-8: Introduce a number of Hyper-V hypercalls, all of which are VTL-aware and
expected to be handled in userspace. Additionally, a new VTL-specific MP
state is introduced.
- 9-10: Pass the instruction length as part of the userspace fault exit data
in order to simplify VSM's secure intercept generation.
- 11-17: Introduce RWX memory attributes as well as extend userspace faults.
- 18: Introduce the main VSM CPUID bit, which gates all VTL configuration and
runtime hypercalls.
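To give a feel for what per-page RWX attributes enable (patches 11-17), here is a minimal sketch of the kind of check a fault path or a VMM could perform. The flag names and the helper are hypothetical and are not the definitions added to the UAPI by this series:

```c
#include <stdint.h>

/* Hypothetical attribute bits; the series defines its own UAPI flags. */
#define DEMO_ATTR_R (1u << 0)
#define DEMO_ATTR_W (1u << 1)
#define DEMO_ATTR_X (1u << 2)
#define DEMO_ATTR_RWX (DEMO_ATTR_R | DEMO_ATTR_W | DEMO_ATTR_X)

/*
 * Return the subset of the requested access bits that the page's
 * attributes do not permit. A result of 0 means the access is allowed;
 * a non-zero result is what would be reported to userspace so that it
 * can generate a secure intercept for the higher VTL.
 */
uint32_t demo_denied_bits(uint32_t attrs, uint32_t access)
{
	return access & ~attrs & DEMO_ATTR_RWX;
}
```

For example, a write to a page marked read and execute only yields DEMO_ATTR_W, while a read of the same page yields 0.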
The series is accompanied by two repositories:
- A PoC QEMU implementation of VSM [3]: This PoC VSM implementation is capable
of booting Windows Server 2016 and 2019 with Credential Guard (CG) enabled
on VMs of any size or vCPU count. It's generally stable, but still sees
its share of crashes. The PoC only implements the VSM interfaces that CG
needs and is by no means comprehensive. All in all, don't expect anything
usable in production.
- VSM kvm-unit-tests [4]: They cover all VSM hypercalls, as well as the KVM
APIs introduced by this series. Unfortunately, they depend on the QEMU
implementation.
We mostly tested on an Intel machine, both with and without TDP. Basic tests
were also run on AMD (build and kvm-unit-tests). Please note that v2 will
include KVM self-tests to close the testing gap, and allow merging this while
we work on the userspace bits.
The series is based on 'kvm/master', that is, commit db574f2f96d0, and is also
available on GitHub [5].
This series also serves as a call-out to anyone interested in collaborating. We
have a proven design, a working PoC, and hopefully a path forward to merge
these KVM APIs. There is still plenty to do in both QEMU and KVM; I'll post a
list of ideas in the future. Feel free to get in touch!
Thanks,
Nicolas
[1] https://raw.githubusercontent.com/Microsoft/Virtualization-Documentation/master/tlfs/Hypervisor%20Top%20Level%20Functional%20Specification%20v6.0b.pdf
[2] https://lore.kernel.org/lkml/20231108111806.92604-1-nsaenz@xxxxxxxxxx/
[3] https://github.com/vianpl/qemu/tree/vsm-v1
[4] https://github.com/vianpl/kvm-unit-tests/tree/vsm-v1
[5] https://github.com/vianpl/linux/tree/vsm-v1
---
Anish Moorthy (1):
KVM: Define and communicate KVM_EXIT_MEMORY_FAULT RWX flags to
userspace
Nicolas Saenz Julienne (17):
KVM: x86: hyper-v: Introduce XMM output support
KVM: x86: hyper-v: Introduce helpers to check if VSM is exposed to
guest
hyperv-tlfs: Update struct hv_send_ipi{_ex}'s declarations
KVM: x86: hyper-v: Introduce VTL awareness to Hyper-V's PV-IPIs
KVM: x86: hyper-v: Introduce MP_STATE_HV_INACTIVE_VTL
KVM: x86: hyper-v: Exit on Get/SetVpRegisters hcall
KVM: x86: hyper-v: Exit on TranslateVirtualAddress hcall
KVM: x86: hyper-v: Exit on StartVirtualProcessor and
GetVpIndexFromApicId hcalls
KVM: x86: Keep track of instruction length during faults
KVM: x86: Pass the instruction length on memory fault user-space exits
KVM: x86/mmu: Introduce infrastructure to handle non-executable
mappings
KVM: x86/mmu: Avoid warning when installing non-private memory
attributes
KVM: x86/mmu: Init memslot if memory attributes available
KVM: Introduce RWX memory attributes
KVM: x86: Take mem attributes into account when faulting memory
KVM: Introduce traces to track memory attributes modification.
KVM: x86: hyper-v: Handle VSM hcalls in user-space
Documentation/virt/kvm/api.rst | 107 +++++++++++++++++++++++-
arch/x86/hyperv/hv_apic.c | 3 +-
arch/x86/include/asm/hyperv-tlfs.h | 2 +-
arch/x86/kvm/Kconfig | 1 +
arch/x86/kvm/hyperv.c | 127 +++++++++++++++++++++++++++--
arch/x86/kvm/hyperv.h | 18 ++++
arch/x86/kvm/mmu/mmu.c | 91 +++++++++++++++++----
arch/x86/kvm/mmu/mmu_internal.h | 9 +-
arch/x86/kvm/mmu/mmutrace.h | 29 +++++++
arch/x86/kvm/mmu/paging_tmpl.h | 2 +-
arch/x86/kvm/mmu/tdp_mmu.c | 8 +-
arch/x86/kvm/svm/svm.c | 7 +-
arch/x86/kvm/vmx/vmx.c | 23 +++++-
arch/x86/kvm/x86.c | 17 +++-
include/asm-generic/hyperv-tlfs.h | 16 +++-
include/linux/kvm_host.h | 45 +++++++++-
include/trace/events/kvm.h | 20 +++++
include/uapi/linux/kvm.h | 15 ++++
virt/kvm/kvm_main.c | 35 +++++++-
19 files changed, 527 insertions(+), 48 deletions(-)
--
2.40.1