[RFC PATCH V2 00/11] Intel EPT-Based Sub-page Protection Support

From: Zhang Yi
Date: Fri Nov 30 2018 - 02:52:33 EST


Here is a patch series which adds EPT-Based Sub-page Write Protection support.

Introduction:

EPT-Based Sub-page Write Protection, referred to as SPP, is a capability which
allows Virtual Machine Monitors (VMMs) to specify write permissions for guest
physical memory at a sub-page (128-byte) granularity. When this capability is
utilized, the CPU enforces write-access permissions for sub-page regions of 4K
pages as specified by the VMM. EPT-based sub-page permissions are intended to
enable fine-grained memory write enforcement by a VMM for security (guest OS
monitoring) and for usages such as device virtualization and memory
checkpointing.

The Sub-Page Permission Table (SPPT) is active when the "sub-page write
protection" VM-execution control is 1. The SPPT is looked up with a
guest-physical address to derive a 64-bit "sub-page permission" value
containing sub-page write permissions. The lookup from guest-physical addresses
to the sub-page region permissions is determined by a set of SPPT paging
structures.

When the "sub-page write protection" VM-execution control is 1, the SPPT is used
to look up write permission bits for the 128-byte sub-page regions contained in
the 4KB guest-physical page. EPT specifies the 4KB page-level privileges that
software is allowed when accessing the guest-physical address, whereas the SPPT
defines the write permissions for software at the granularity of 128-byte
regions within a 4KB page. Write accesses prevented due to sub-page permissions
looked up via the SPPT are reported as EPT violation VM exits. Similar to EPT, a
logical processor uses the SPPT to look up sub-page region write permissions for
guest-physical addresses only when those addresses are used to access memory.

______________________________________________________________________________

How SPP hardware works:
______________________________________________________________________________


Guest write access --> GPA --> Walk EPT --> EPT leaf entry --+
+------------------------------------------------------------+
+-> if VMexec_control.spp && ept_leaf_entry.spp_bit (bit 61)
     |
     +-> <false> --> EPT legacy behavior
     |
     +-> <true>  --> if ept_leaf_entry.writable
                      |
                      +-> <true>  --> Ignore SPP
                      |
                      +-> <false> --> GPA --> Walk SPP 4-level table --+
                                                                       |
+---------<--------- get the SPPT pointer from VMCS field ------<------+
|
Walk SPPT L4E --+--> if entry misconfiguration --> SPP VM exit (*)
                |
               else
                |
Walk SPPT L3E --+--> if entry misconfiguration --> SPP VM exit (*)
                |
               else
                |
Walk SPPT L2E --+--> if entry misconfiguration --> SPP VM exit (*)
                |
               else
                |
Walk SPPT L1E --+--> if entry misconfiguration --> SPP VM exit (*)
                |
               else
                |
                +-> if sub-page writable
                     +-> <true>  --> allow write access
                     +-> <false> --> disallow, EPT violation

(*) SPP VM exit
     +-> exit_qualification & sppt_misconfig --> SPPT misconfig
     +-> exit_qualification & sppt_miss      --> SPPT miss

Patch description:

Patch 1: The design document for EPT-Based Sub-page Write Protection (SPP).

Patch 2: Report the SPP capability from the VMX Procbased MSR. According to the
hardware spec, bit 23 controls the SPP capability.

Patch 3: Add a new secondary processor-based VM-execution control bit, defined
as "sub-page write permission". As in the VMX Procbased MSR, bit 23 is the
enable bit of SPP. We also introduce a kernel parameter "enable_ept_spp"; SPP is
now active when "Sub-page Write Protection" in the Secondary VM-Execution
Control is set and the kernel parameter is enabled with "spp=1".

Patch 4: Introduce the SPPTP and the SPP page table.
The sub-page permission table is referenced via a 64-bit control field called
Sub-Page Permission Table Pointer (SPPTP), which contains a 4K-aligned physical
address. The encoding for this VMCS field is defined as 0x2030 at this time.
The format of the SPPTP is shown in the figure below:

| Bit    | Contents                                       |
| :----- | :--------------------------------------------- |
| 11:0   | Reserved (0)                                   |
| N-1:12 | Physical address of 4KB aligned SPPT L4E Table |
| 51:N   | Reserved (0)                                   |
| 63:52  | Reserved (0)                                   |
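
As an illustration of the SPPTP layout above, here is a minimal sketch of a
validity check. It is not code from the series: the helper name
sppt_pointer_valid() and its maxphyaddr parameter (N, the physical address
width, assumed to be between 13 and 52) are made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only.
 * Bits N-1:12 hold the physical address of the SPPT L4E table; bits 11:0
 * and bits 63:N are reserved and must be zero.
 * maxphyaddr is N and is assumed to satisfy 12 < N <= 52.
 */
static bool sppt_pointer_valid(uint64_t spptp, unsigned int maxphyaddr)
{
	uint64_t addr_mask = ((1ULL << maxphyaddr) - 1) & ~0xfffULL;

	/* Any bit outside the address field makes the pointer invalid. */
	return (spptp & ~addr_mask) == 0;
}
```

A 4K-aligned address below 2^N passes the check; anything with low bits set or
with bits at or above N is rejected.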

This patch introduces the SPP paging structures, whose root page is created at
KVM MMU page initialization. We also add an MMU page role type, spp, to
distinguish an SPP page from an EPT page.

Patch 5: Define the SPPTP field in the new VMCS area, then write the SPPTP to
the VMCS.

Patch 6: Introduce the SPP-induced VM exit and its handler.
Accesses using guest-physical addresses may cause SPP-induced VM exits due to an
SPPT misconfiguration or an SPPT miss. The basic VM exit reason code reported for
SPP-induced VM exits is 66.

Also introduce the exit qualification below for SPPT-induced VM exits.

| Bit | Contents |
| :---- | :---------------------------------------------------------------- |
| 10:0 | Reserved (0). |
| 11 | SPPT VM exit type. Set for SPPT Miss, cleared for SPPT Misconfig. |
| 12 | NMI unblocking due to IRET |
| 63:13 | Reserved (0) |
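
To make the exit qualification layout concrete, a handler could decode it
roughly as follows. This is only a sketch: the macro, enum and function names
are invented here and are not necessarily the ones used in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative names; only the numeric values come from the tables above. */
#define SPP_EXIT_REASON            66
#define SPPT_EXIT_QUAL_MISS        (1ULL << 11) /* set: miss, clear: misconfig */
#define SPPT_EXIT_QUAL_NMI_UNBLOCK (1ULL << 12) /* NMI unblocking due to IRET */

enum sppt_exit_kind {
	SPPT_EXIT_MISCONFIG,
	SPPT_EXIT_MISS,
};

/* Bit 11 distinguishes an SPPT miss from an SPPT misconfiguration. */
static enum sppt_exit_kind sppt_decode_exit(uint64_t exit_qual)
{
	return (exit_qual & SPPT_EXIT_QUAL_MISS) ? SPPT_EXIT_MISS
						 : SPPT_EXIT_MISCONFIG;
}

/* Bit 12 reports NMI unblocking due to IRET. */
static bool sppt_exit_nmi_unblocking(uint64_t exit_qual)
{
	return (exit_qual & SPPT_EXIT_QUAL_NMI_UNBLOCK) != 0;
}
```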

Patch 7: Add a handler for the EPT sub-page write protection fault.
A control bit in EPT leaf paging-structure entries is defined as Sub-Page
Permission (SPP bit). The bit position is 61; it is chosen from among the bits
that are currently ignored by the processor and available to software.

While the hardware walks the SPP page table, if the sub-page region write
permission bit is set, the write is allowed; otherwise the write is disallowed
and results in an EPT violation.

We need to catch this case in the EPT violation handler and trigger a user-space
exit, returning the write-protected address (GVA) to user space (QEMU).

Patch 8: Introduce ioctls to set/get Sub-Page Write Protection.
We introduce two ioctls that let a user application set/get the sub-page write
protection bitmap per gfn; each gfn corresponds to one bitmap.
The user application (QEMU or some other security control daemon) sets the
protection bitmap via these ioctls.

The API is defined as:

	struct kvm_subpage {
		__u64 base_gfn;	/* first guest frame number */
		__u64 npages;	/* number of pages starting at base_gfn */
		/* sub-page write-access bitmap array, one bitmap per gfn */
		__u32 access_map[SUBPAGE_MAX_BITMAP];
	} sp;

	kvm_vm_ioctl(s, KVM_SUBPAGES_SET_ACCESS, &sp);
	kvm_vm_ioctl(s, KVM_SUBPAGES_GET_ACCESS, &sp);
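
For example, user space might build one access_map entry like this before
calling the set ioctl. This is a sketch under assumptions: spp_protect_range()
is a made-up helper, bit i of the map is assumed to describe sub-page i, and a
set bit is assumed to mean "write permitted".

```c
#include <stdint.h>

#define SPP_SUBPAGE_SIZE 128u /* bytes per sub-page region */
#define SPP_SUBPAGES     32u  /* 4096 / 128 regions per 4K page */

/*
 * Hypothetical helper: return a 32-bit access map where every sub-page is
 * writable except those overlapping bytes [offset, offset + len) of the
 * page. The caller is assumed to pass offset + len <= 4096.
 */
static uint32_t spp_protect_range(uint32_t offset, uint32_t len)
{
	uint32_t map = 0xffffffffu;
	uint32_t first, last, i;

	if (len == 0)
		return map;

	first = offset / SPP_SUBPAGE_SIZE;
	last = (offset + len - 1) / SPP_SUBPAGE_SIZE;

	/* Clear the write-permission bit of each overlapped sub-page. */
	for (i = first; i <= last && i < SPP_SUBPAGES; i++)
		map &= ~(1u << i);

	return map;
}
```

Protecting 100 bytes starting at offset 100 clears bits 0 and 1, because the
range straddles the first two 128-byte regions.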

Patch 9 ~ Patch 11: Set up the SPP page table and update the EPT leaf entry
with the SPP enable bit. If the sub-page write permission VM-execution control
is set, treatment of write accesses to guest-physical addresses depends on the
state of the accumulated write-access bit (position 1) and the sub-page
permission bit (position 61) in the EPT leaf paging-structure entry.

Software updates the sub-page permission bit of the EPT leaf entry during
kvm_set_subpage (patch 7). If the EPT write-access bit is set to 0 and the SPP
bit is set to 1 in the leaf EPT paging-structure entry that maps a 4KB page,
then the hardware looks up a VMM-managed Sub-Page Permission Table (SPPT),
which is prepared by kvm_set_subpage (patch 8).

The hardware uses the guest-physical address and bits 11:7 of the address
accessed to look up the SPPT and fetch a write permission bit for the 128-byte
sub-page region being accessed within the 4K guest-physical page. If the
sub-page region write permission bit is set, the write is allowed; otherwise
the write is disallowed and results in an EPT violation.
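
The index computation can be sketched as below. The function names are
illustrative only, and the 32-bit map with one bit per sub-page (set meaning
write permitted) is the software view assumed for this example, not the
hardware L1E encoding.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits 11:7 of the guest-physical address select one of 32 sub-pages. */
static unsigned int spp_subpage_index(uint64_t gpa)
{
	return (unsigned int)((gpa >> 7) & 0x1f);
}

/*
 * access_map: one bit per 128-byte sub-page of the 4K page containing gpa;
 * a set bit is assumed to mean that writes are permitted.
 */
static bool spp_write_allowed(uint32_t access_map, uint64_t gpa)
{
	return ((access_map >> spp_subpage_index(gpa)) & 1u) != 0;
}
```

With this encoding, a write to offset 0x80 of a page lands in sub-page 1, and
a write to offset 0xfff lands in sub-page 31.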

Guest-physical pages mapped via leaf EPT paging structures for which the
accumulated write-access bit and the SPP bit are both clear (0) generate EPT
violations on memory write accesses. Guest-physical pages mapped via EPT
paging structures for which the accumulated write-access bit is set (1) allow
writes, effectively ignoring the SPP bit in the leaf EPT paging structure.
Software sets up SPP page table levels 4, 3 and 2 like the EPT page structures,
and fills level 1 from the 32-bit bitmap for each 4K page, so that the page is
divided into 32 sub-pages of 128 bytes each.
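
The three cases described above can be summarized in a small model. This is a
simplification for illustration: the names are invented, and the accumulated
write-access bit is represented by bit 1 of a single leaf entry value rather
than being accumulated across the EPT levels.

```c
#include <stdbool.h>
#include <stdint.h>

#define EPT_WRITABLE_BIT (1ULL << 1)  /* write-access bit, position 1 */
#define EPT_SPP_BIT      (1ULL << 61) /* sub-page permission bit, position 61 */

enum spp_write_result {
	SPP_WRITE_ALLOWED,       /* page writable, SPP bit ignored */
	SPP_WRITE_EPT_VIOLATION, /* not writable and no SPPT lookup */
	SPP_WRITE_CHECK_SPPT,    /* hardware consults the SPPT */
};

/* spp_control models the "sub-page write protection" VM-execution control. */
static enum spp_write_result spp_classify_write(bool spp_control,
						uint64_t ept_leaf)
{
	if (ept_leaf & EPT_WRITABLE_BIT)
		return SPP_WRITE_ALLOWED;
	if (spp_control && (ept_leaf & EPT_SPP_BIT))
		return SPP_WRITE_CHECK_SPPT;
	return SPP_WRITE_EPT_VIOLATION;
}
```

Only the SPP_WRITE_CHECK_SPPT case reaches the SPPT walk; its outcome then
depends on the sub-page write permission bit.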

The SPPT L4E, L3E and L2E format is defined in the figure below.

| Bit    | Contents                                                        |
| :----- | :-------------------------------------------------------------- |
| 0      | Valid entry when set; indicates whether the entry is present    |
| 11:1   | Reserved (0)                                                    |
| N-1:12 | Physical address of 4K SPPT L(X-1) Table referenced by the entry |
| 51:N   | Reserved (0)                                                    |
| 63:52  | Reserved (0)                                                    |

Note: N is the physical address width supported by the processor, X is the page level

The SPPT L1E format is defined in the figure below.

| Bit  | Contents                                            |
| :--- | :-------------------------------------------------- |
| 0+2i | Write permission for i-th 128 byte sub-page region. |
| 1+2i | Reserved (0).                                       |

Note: `0<=i<=31`
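
Filling an L1E from a 32-bit bitmap then amounts to spreading bit i of the
bitmap to bit 2i of the entry. A minimal sketch follows; the function name is
made up for this example, and the bitmap is assumed to use a set bit for
"write permitted".

```c
#include <stdint.h>

/*
 * Hypothetical helper: expand a 32-bit sub-page write bitmap into the
 * 64-bit L1E layout, where bit 2i carries the write permission of
 * sub-page i and bit 2i+1 is reserved (0).
 */
static uint64_t spp_l1e_from_access_map(uint32_t access_map)
{
	uint64_t l1e = 0;
	unsigned int i;

	for (i = 0; i < 32; i++) {
		if (access_map & (1u << i))
			l1e |= 1ULL << (2 * i);
	}

	return l1e;
}
```

A fully writable page (0xffffffff) therefore maps to 0x5555555555555555.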

Change logs:
V1 -> V2:
1. Rebased to 4.20-rc1
2. Moved the VMCS change to a separate patch.
3. Code refinement and bug fixes

Zhang Yi (11):
Documentation: Added EPT Subpage Protection Documentation.
x86/cpufeature: Add intel Sub-Page Protection to CPU features
KVM: VMX: Added VMX SPP feature flags and VM-Execution Controls.
KVM: VMX: Introduce the SPPTP and SPP page table.
KVM: VMX: Write the SPPTP to VMCS area.
KVM: VMX: Introduce SPP-Induced vm exit and it's handle.
KVM: VMX: Added handle of SPP write protection fault.
KVM: VMX: Introduce ioctls to set/get Sub-Page Write Protection.
KVM: VMX: Update the EPT leaf entry indicated with the SPP enable bit.
KVM: VMX: Added setup spp page structure.
KVM: VMX: implement setup SPP page structure in spp miss.

Documentation/virtual/kvm/spp_design_kvm.txt | 275 ++++++++++++++++++++++
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/kvm_host.h | 19 +-
arch/x86/include/asm/vmx.h | 10 +
arch/x86/include/uapi/asm/vmx.h | 2 +
arch/x86/kernel/cpu/intel.c | 4 +
arch/x86/kvm/mmu.c | 334 ++++++++++++++++++++++++++-
arch/x86/kvm/mmu.h | 1 +
arch/x86/kvm/vmx.c | 105 +++++++++
arch/x86/kvm/x86.c | 124 +++++++++-
include/linux/kvm_host.h | 5 +
include/uapi/linux/kvm.h | 16 ++
12 files changed, 892 insertions(+), 4 deletions(-)
create mode 100644 Documentation/virtual/kvm/spp_design_kvm.txt

--
2.7.4