[PATCH v3 00/25] KVM SGX virtualization support

From: Kai Huang
Date: Fri Mar 19 2021 - 03:30:25 EST

This series adds KVM SGX virtualization support. The first 14 patches, whose
subjects start with x86/sgx or x86/cpu, are the changes to the x86 and SGX
core/driver code needed to support KVM SGX virtualization. The remaining
patches are for the KVM subsystem.

This series is based on the latest tip/x86/sgx, which has Jarkko's NUMA
allocation support.

You can also get the code from the upstream branch of the kvm-sgx repo on GitHub:

https://github.com/intel/kvm-sgx.git upstream

QEMU changes are also required to create a VM with SGX support. You can find
the QEMU repo here:

https://github.com/intel/qemu-sgx.git upstream

Please refer to the README.md of the above qemu-sgx repo for details on how to
create a guest with SGX support. In the meantime, for quick reference, you can
use the command below to create an SGX guest:

  # qemu-system-x86_64 -smp 4 -m 2G -drive file=<your_vm_image>,if=virtio \
        -cpu host,+sgx_provisionkey \
        -sgx-epc id=epc1,memdev=mem1 \
        -object memory-backend-epc,id=mem1,size=64M,prealloc

Please note that the SGX relevant part is:

        -cpu host,+sgx_provisionkey \
        -sgx-epc id=epc1,memdev=mem1 \
        -object memory-backend-epc,id=mem1,size=64M,prealloc

You can change the other parameters of your qemu command as needed.
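
For scripted setups, the command above can be assembled programmatically. The
following is only a minimal sketch in Python: the helper name, its parameters
and the "guest.img" image path are made up for illustration, and only the qemu
options already shown above are used.

```python
import shlex


def sgx_guest_qemu_args(image, epc_size="64M", smp=4, mem="2G"):
    """Build the qemu command line shown above as an argv list.

    Hypothetical helper: only the option strings come from the cover
    letter, the function and parameter names are illustrative.
    """
    return [
        "qemu-system-x86_64",
        "-smp", str(smp),
        "-m", mem,
        "-drive", f"file={image},if=virtio",
        # SGX-relevant part: expose the provisioning key bit and back the
        # guest EPC section with a memory-backend-epc object.
        "-cpu", "host,+sgx_provisionkey",
        "-sgx-epc", "id=epc1,memdev=mem1",
        "-object", f"memory-backend-epc,id=mem1,size={epc_size},prealloc",
    ]


if __name__ == "__main__":
    # "guest.img" is a placeholder image path.
    print(shlex.join(sgx_guest_qemu_args("guest.img", epc_size="128M")))
```

Only the size= value of the memory-backend-epc object changes with the
requested EPC size. The rest of the SGX-relevant options stay as listed above.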


(The changelog here is for global changes. Please see each patch's changelog
for changes made to that specific patch.)


v2->v3:

- No big change in design, structure of patch series, etc.
- Rebased to the latest tip/x86/sgx, to resolve a merge conflict in patch 3
(x86/sgx: Wipe out EREMOVE from sgx_free_epc_page()).
- Addressed some nits found by Sean in v2.
- Also addressed some nits reported by checkpatch.pl. There are now no
checkpatch issues.
- Updated patch 3 (x86/sgx: Wipe out EREMOVE from sgx_free_epc_page()):
- Removed Jarkko from author, per request.
- Changed to replace all call sites of sgx_free_epc_page() with the new
sgx_encl_free_epc_page(), so that this patch has no functional
changes (except a WARN upon EREMOVE failure requested by Dave).
- Rebased to tip/x86/sgx, which has Jarkko's NUMA allocation.
- Added Jarkko's Acked-by.
- Updated patch 8 (x86/sgx: Expose SGX architectural definitions to the
kernel) to add a MAINTAINERS file update to include the newly introduced
asm/sgx.h.
- Updated patch 13 (x86/sgx: Add helpers to expose ECREATE and EINIT to KVM)
to use addr and size directly in access_ok()s (which won't be triggered


v1->v2:

- No big change in design, structure of patch series, etc.
- Addressed Boris's comments regarding suppressing both SGX1 and SGX2 in
/proc/cpuinfo, and improvements in feat_ctl.c when enabling SGX (patch 2
and 6).
- Addressed Sean's comments for both x86 part patches and KVM patches (patch 3,
5, 9, 12, 19, 21).
- Addressed Dave's comments in RFC v6 series (patch 13).


RFC->v1:

- Refined patch (x86/sgx: Wipe out EREMOVE from sgx_free_epc_page()) to print
an error message that the EPC page is leaked when EREMOVE fails, as requested
by Dave.
- Changelog history of all RFC series is removed from both this cover letter
and each individual patch, since the majority of the x86 part patches already
got Acked-by from Dave and Jarkko, and the changelogs are not very useful from
my perspective.

KVM SGX virtualization Overview

- Virtual EPC

SGX enclave memory is special and is reserved specifically for enclave use.
In bare-metal SGX enclaves, the kernel allocates enclave pages, copies data
into the pages with privileged instructions, then allows the enclave to start.
In this scenario, only initialized pages already assigned to an enclave are
mapped to userspace.

In virtualized environments, the hypervisor still needs to do the physical
enclave page allocation. The guest kernel is responsible for the data copying
(among other things). This means that the job of starting an enclave is now
split between hypervisor and guest.

This series introduces a new misc device: /dev/sgx_vepc. This device allows
the host to map *uninitialized* enclave memory into userspace, which can then
be passed into a guest.

While it might be *possible* to start a host-side enclave with /dev/sgx_enclave
and pass its memory into a guest, it would be wasteful and convoluted.
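
As a rough illustration of the userspace side, the sketch below opens the new
device and maps a range of virtual EPC. It is a minimal sketch, not the VMM
implementation: the helper name is made up, and the use of a shared read/write
mapping is an assumption about how a VMM would consume the device.

```python
import mmap
import os

VEPC_PATH = "/dev/sgx_vepc"  # misc device introduced by this series


def map_virtual_epc(size, path=VEPC_PATH):
    """Map `size` bytes of uninitialized virtual EPC.

    Hypothetical helper. Returns (fd, mapping), or None when the device
    is missing or cannot be opened or mapped. `size` is assumed to be a
    multiple of the page size.
    """
    try:
        fd = os.open(path, os.O_RDWR)
    except OSError:
        return None  # no virtual EPC support on this host, or no permission
    try:
        # Assumption: a shared, read/write mapping, which the VMM would
        # then hand to the guest as just another memory backend.
        mapping = mmap.mmap(fd, size, flags=mmap.MAP_SHARED,
                            prot=mmap.PROT_READ | mmap.PROT_WRITE)
    except (OSError, ValueError):
        os.close(fd)
        return None
    return fd, mapping
```

The host is not expected to read or write the mapping itself. The range is
only meant to be passed into a guest, which then does the ECREATE/EADD/EINIT
work on it.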

Implement the *raw* EPC allocation in the x86 core-SGX subsystem via
/dev/sgx_vepc rather than in KVM. Doing so has two major advantages:

- Does not require changes to KVM's uAPI, e.g. EPC gets handled as
just another memory backend for guests.

- EPC management is wholly contained in the SGX subsystem, e.g. SGX
does not have to export any symbols, changes to reclaim flows don't
need to be routed through KVM, SGX's dirty laundry doesn't have to
get aired out for the world to see, and so on and so forth.

The virtual EPC pages allocated to guests are currently not reclaimable.
Reclaiming EPC pages used by enclaves requires a special reclaim mechanism
separate from normal page reclaim, and that mechanism is not supported
for virtual EPC pages. Due to the complications of handling reclaim
conflicts between guest and host, reclaiming virtual EPC pages is
significantly more complex than basic support for SGX virtualization.

- Support SGX virtualization without SGX Flexible Launch Control

SGX hardware supports two "launch control" modes to limit which enclaves can
run. In the "locked" mode, the hardware prevents enclaves from running unless
they are blessed by a third party. In the unlocked mode, the kernel is in
full control of which enclaves can run. The bare-metal SGX code refuses to
launch enclaves unless it is in the unlocked mode.
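
The practical effect can be sketched with a small check on the CPU flags. This
is a simplified model under stated assumptions: that the relevant
/proc/cpuinfo flag names are "sgx" and "sgx_lc", and that the kernel hides
"sgx_lc" when launch control is locked. Other requirements for guest use
(such as VMX being enabled) are ignored here, and the function names are
hypothetical.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of CPU flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def sgx_usage(flags):
    """Simplified model of who can use SGX, given the host CPU flags.

    Assumption: the bare-metal driver needs both 'sgx' and 'sgx_lc'
    (unlocked launch control), while SGX virtualization only needs 'sgx'.
    """
    has_sgx = "sgx" in flags
    return {
        "bare_metal_driver": has_sgx and "sgx_lc" in flags,
        "kvm_guests": has_sgx,
    }


if __name__ == "__main__":
    with open("/proc/cpuinfo") as f:
        print(sgx_usage(parse_cpu_flags(f.read())))
```

On a host in the locked mode this model reports the bare-metal driver as
unusable while still allowing SGX for guests, which is the case this part of
the series is about.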

The virtual EPC driver (/dev/sgx_vepc) does not have such a restriction. This allows guests
which are OK with the locked mode to use SGX, even if the host kernel refuses

- Support exposing SGX2

For the same reason as above, SGX2 feature detection is added to the core SGX
code to allow KVM to expose SGX2 to guests, even though the SGX driver does not
currently support SGX2, because SGX2 can work just fine in a guest without any
interaction with host SGX

- Restrict SGX guest access to provisioning key

For a guest to be able to fully use SGX, it needs to be able to access the
provisioning key. The provisioning key is sensitive, and access to it should
be restricted. In the bare-metal driver, an enclave is only allowed to access
the provisioning key if its owner is able to open /dev/sgx_provision.

Add a new KVM_CAP_SGX_ATTRIBUTE to the KVM uAPI to extend the above mechanism
to KVM guests as well. When the userspace hypervisor creates a new VM, the new
cap is only added to the VM when the userspace hypervisor is able to open
/dev/sgx_provision, following the same rule as in the bare-metal driver. KVM
then traps ECREATE from the guest, and only allows ECREATE with the
provisioning key bit to run when guest
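
The gating described in this section can be modelled in a few lines. This is a
toy model for illustration, not the KVM code: the function names are
hypothetical, the bit position of the provisioning key attribute is taken from
the SGX architectural definitions (bit 4 of SECS.ATTRIBUTES), and the rule
"refuse the provisioning key bit unless the VM was granted the capability" is
an assumption drawn from the description above.

```python
SGX_ATTR_PROVISIONKEY = 1 << 4  # provisioning key bit in SECS.ATTRIBUTES


def vm_provisioning_allowed(vmm_can_open_sgx_provision):
    """The cap is only granted when the VMM can open /dev/sgx_provision."""
    return bool(vmm_can_open_sgx_provision)


def ecreate_allowed(secs_attributes, provisioning_allowed):
    """Model of the check applied to a trapped guest ECREATE.

    Assumption: an ECREATE requesting the provisioning key bit is only
    let through when the VM was granted access. Every other ECREATE is
    unaffected by this particular check.
    """
    if secs_attributes & SGX_ATTR_PROVISIONKEY:
        return provisioning_allowed
    return True
```

For example, ecreate_allowed(SGX_ATTR_PROVISIONKEY, False) is refused in this
model, while the same request for a VM whose VMM could open
/dev/sgx_provision is allowed.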

Kai Huang (4):
  x86/cpufeatures: Make SGX_LC feature bit depend on SGX bit
  x86/sgx: Wipe out EREMOVE from sgx_free_epc_page()
  x86/sgx: Initialize virtual EPC driver even when SGX driver is
  x86/sgx: Add helper to update SGX_LEPUBKEYHASHn MSRs

Sean Christopherson (21):
  x86/cpufeatures: Add SGX1 and SGX2 sub-features
  x86/sgx: Add SGX_CHILD_PRESENT hardware error code
  x86/sgx: Introduce virtual EPC for use by KVM guests
  x86/cpu/intel: Allow SGX virtualization without Launch Control support
  x86/sgx: Expose SGX architectural definitions to the kernel
  x86/sgx: Move ENCLS leaf definitions to sgx.h
  x86/sgx: Add SGX2 ENCLS leaf definitions (EAUG, EMODPR and EMODT)
  x86/sgx: Add encls_faulted() helper
  x86/sgx: Add helpers to expose ECREATE and EINIT to KVM
  x86/sgx: Move provisioning device creation out of SGX driver
  KVM: x86: Export kvm_mmu_gva_to_gpa_{read,write}() for SGX (VMX)
  KVM: x86: Define new #PF SGX error code bit
  KVM: x86: Add support for reverse CPUID lookup of scattered features
  KVM: x86: Add reverse-CPUID lookup support for scattered SGX features
  KVM: VMX: Add basic handling of VM-Exit from SGX enclave
  KVM: VMX: Frame in ENCLS handler for SGX virtualization
  KVM: VMX: Add SGX ENCLS[ECREATE] handler to enforce CPUID restrictions
  KVM: VMX: Add emulation of SGX Launch Control LE hash MSRs
  KVM: VMX: Add ENCLS[EINIT] handler to support SGX Launch Control (LC)
  KVM: VMX: Enable SGX virtualization for SGX1, SGX2 and LC
  KVM: x86: Add capability to grant VM access to privileged SGX

 Documentation/virt/kvm/api.rst           |  23 +
 arch/x86/Kconfig                         |  12 +
 arch/x86/include/asm/cpufeatures.h       |   2 +
 arch/x86/include/asm/kvm_host.h          |   5 +
 .../cpu/sgx/arch.h => include/asm/sgx.h} |  50 +-
 arch/x86/include/asm/vmx.h               |   1 +
 arch/x86/include/uapi/asm/vmx.h          |   1 +
 arch/x86/kernel/cpu/cpuid-deps.c         |   3 +
 arch/x86/kernel/cpu/feat_ctl.c           |  71 ++-
 arch/x86/kernel/cpu/scattered.c          |   2 +
 arch/x86/kernel/cpu/sgx/Makefile         |   1 +
 arch/x86/kernel/cpu/sgx/driver.c         |  17 -
 arch/x86/kernel/cpu/sgx/encl.c           |  42 +-
 arch/x86/kernel/cpu/sgx/encl.h           |   1 +
 arch/x86/kernel/cpu/sgx/encls.h          |  30 +-
 arch/x86/kernel/cpu/sgx/ioctl.c          |  29 +-
 arch/x86/kernel/cpu/sgx/main.c           |  96 +++-
 arch/x86/kernel/cpu/sgx/sgx.h            |  13 +-
 arch/x86/kernel/cpu/sgx/virt.c           | 369 ++++++++++++++
 arch/x86/kvm/Makefile                    |   2 +
 arch/x86/kvm/cpuid.c                     |  89 +++-
 arch/x86/kvm/cpuid.h                     |  50 +-
 arch/x86/kvm/vmx/nested.c                |  28 +-
 arch/x86/kvm/vmx/nested.h                |   5 +
 arch/x86/kvm/vmx/sgx.c                   | 481 ++++++++++++++++++
 arch/x86/kvm/vmx/sgx.h                   |  34 ++
 arch/x86/kvm/vmx/vmcs12.c                |   1 +
 arch/x86/kvm/vmx/vmcs12.h                |   4 +-
 arch/x86/kvm/vmx/vmx.c                   | 109 +++-
 arch/x86/kvm/vmx/vmx.h                   |   2 +
 arch/x86/kvm/x86.c                       |  23 +
 include/uapi/linux/kvm.h                 |   1 +
 tools/testing/selftests/sgx/defines.h    |   2 +-
 34 files changed, 1476 insertions(+), 124 deletions(-)
 rename arch/x86/{kernel/cpu/sgx/arch.h => include/asm/sgx.h} (89%)
 create mode 100644 arch/x86/kernel/cpu/sgx/virt.c
 create mode 100644 arch/x86/kvm/vmx/sgx.c
 create mode 100644 arch/x86/kvm/vmx/sgx.h