Re: [PATCH RFC 00/10] Intel EPT-Based Sub-page Write Protection Support.

From: Paolo Bonzini
Date: Fri Oct 13 2017 - 17:13:32 EST



> I'll ask before Paolo does: Can you please add kvm-unit-tests to
> exercise all of this new code?

More specifically it should be the api/ unit tests because this code
can only be triggered by specific code in the host.

However, as things stand I'm not sure about how userspace would use it.
Only allowing blocking of writes means that we cannot (for example) use
it to do sub-page passthrough in VFIO. That would be useful when the
MSI-X table does not fit a full page, but would require blocking reads
as well. And the introspection facility by Mihai uses a completely
different API for the introspector, based on sockets rather than ioctls.
So I'm not sure this is the right API at all.

Paolo

> BTW, what generation of hardware do we need to exercise this code ourselves?
>
> On Fri, Oct 13, 2017 at 4:11 PM, Zhang Yi <yi.z.zhang@xxxxxxxxxxxxxxx> wrote:
> > From: Zhang Yi Z <yi.z.zhang@xxxxxxxxxxxxxxx>
> >
> > Hi All,
> >
> > Here is a patch series that adds EPT-Based Sub-page Write Protection
> > support. The software developer's manual is available at:
> >
> > https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
> >
> > See Chapter 4, EPT-BASED SUB-PAGE PERMISSIONS.
> >
> > Introduction:
> >
> > EPT-Based Sub-page Write Protection, referred to as SPP, is a capability
> > that allows a Virtual Machine Monitor (VMM) to specify write permissions
> > for guest physical memory at sub-page (128-byte) granularity. When this
> > capability is in use, the CPU enforces write-access permissions for
> > sub-page regions of 4K pages as specified by the VMM. EPT-based sub-page
> > permissions are intended to enable fine-grained memory write enforcement
> > by a VMM for security (guest OS monitoring) and for usages such as device
> > virtualization and memory checkpointing.
> >
> > How SPP Works:
> >
> > SPP is active when the "sub-page write protection" VM-execution control is
> > 1. A new 4-level paging structure, the SPP page table (SPPT), is
> > introduced. The SPPT is walked with the guest-physical address to derive a
> > 64-bit "sub-page permission" value containing the sub-page write
> > permissions. The mapping from guest-physical addresses to sub-page region
> > permissions is determined by this set of SPPT paging structures.
> >
> > The SPPT is used to look up write permission bits for the 128-byte sub-page
> > regions contained in a 4KB guest-physical page. EPT specifies the 4KB
> > page-level privileges that software is allowed when accessing the
> > guest-physical address, whereas SPPT defines the write permissions for
> > software at 128-byte granularity within a 4KB page. Write accesses
> > prevented due to sub-page permissions looked up via SPPT are reported as
> > EPT violation VM exits. Similar to EPT, a logical processor uses SPPT to
> > look up sub-page region write permissions for guest-physical addresses only
> > when those addresses are used to access memory.
> >
> > Guest write access --> GPA --> Walk EPT --> EPT leaf entry --+
> > +------------------------------------------------------------+
> > +-> if VMexec_control.spp && ept_leaf_entry.spp_bit (bit 61)
> >     |
> >     +-> <false> --> EPT legacy behavior
> >     |
> >     +-> <true>  --> if ept_leaf_entry.writable
> >                     |
> >                     +-> <true>  --> Ignore SPP
> >                     |
> >                     +-> <false> --> GPA --> Walk SPP 4-level table --+
> >                                                                      |
> > +---------<--------- get the SPPT pointer from VMCS field ----<------+
> > |
> > Walk SPPT L4E
> > |
> > +--> entry misconfiguration ---------->---------+<-------------------+
> > |                                               |                    |
> > else                                            |                    |
> > |                                               |                    |
> > |   +----------------- SPP VM exit <------------+                    |
> > |   |                                                                |
> > |   +-> exit_qualification & sppt_misconfig --> SPPT misconfig       |
> > |   |                                                                |
> > |   +-> exit_qualification & sppt_miss      --> SPPT miss            |
> > |                                                                    |
> > Walk SPPT L3E --+--> if entry misconfiguration ---------->-----------+
> >                 |                                                    |
> >                 else                                                 |
> >                 |                                                    |
> > Walk SPPT L2E --+--> if entry misconfiguration ---------->-----------+
> >                 |                                                    |
> >                 else                                                 |
> >                 |                                                    |
> > Walk SPPT L1E --+--> if entry misconfiguration ---------->-----------+
> >                 |
> >                 else
> >                 |
> >                 +-> if sub-page writable
> >                     +-> <true>  allow write access
> >                     +-> <false> disallow, EPT violation
> >
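
The decision made at the end of the flow above can be modelled in a few lines of C. This is only a sketch under stated assumptions: the macro and function names are hypothetical and not taken from the series, the SPPT L1 entry is assumed to have been fetched already (SPPT miss and misconfiguration exits are not modelled), and the leaf write bit stands in for the accumulated write-access bit. The bit positions (1, 61, and address bits 11:7) are the ones given in this cover letter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; bit positions are the ones quoted in the cover letter. */
#define EPT_WRITABLE_BIT (1ULL << 1)   /* EPT write-access bit */
#define EPT_SPP_BIT      (1ULL << 61)  /* SPP bit in an EPT leaf entry */

enum spp_result {
	SPP_WRITE_ALLOWED,
	SPP_EPT_VIOLATION,
};

/*
 * Model the outcome of a guest write to a 4KB page.
 * sppt_l1e is the already-fetched SPPT L1 entry for that page.
 */
static enum spp_result spp_check_write(bool spp_enabled, uint64_t ept_leaf,
					uint64_t sppt_l1e, uint64_t gpa)
{
	unsigned int idx;

	/* A writable EPT leaf allows the write; the SPP bit is ignored. */
	if (ept_leaf & EPT_WRITABLE_BIT)
		return SPP_WRITE_ALLOWED;

	/* Legacy behaviour: a non-writable leaf without SPP faults. */
	if (!spp_enabled || !(ept_leaf & EPT_SPP_BIT))
		return SPP_EPT_VIOLATION;

	/* Bits 11:7 of the address select one of 32 sub-pages. */
	idx = (unsigned int)((gpa >> 7) & 0x1f);

	/* The write permission for sub-page i lives in L1E bit 2*i. */
	return ((sppt_l1e >> (2 * idx)) & 1) ? SPP_WRITE_ALLOWED
					     : SPP_EPT_VIOLATION;
}
```

For example, with only L1E bit 2 set, a write to offset 0x80 of the page is allowed while a write to offset 0x00 results in an EPT violation.
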
> > Patch-sets Description:
> >
> > Patch 1: Documentation.
> >
> > Patch 2: Reports the SPP capability from the VMX Procbased MSR. According
> > to the hardware spec, bit 23 controls the SPP capability.
> >
> > Patch 3: Adds the new secondary processor-based VM-execution control bit,
> > defined as "sub-page write permission". As in the VMX Procbased MSR, bit 23
> > is the SPP enable bit.
> > It also introduces a kernel parameter, "enable_ept_spp". SPP is now active
> > when "Sub-page Write Protection" in the Secondary VM-Execution Controls is
> > set and the kernel parameter is enabled with "enable_ept_spp=1".
> >
> > Patch 4: Introduces the SPPTP and the SPP page table.
> > The sub-page permission table is referenced via a 64-bit control field
> > called the Sub-Page Permission Table Pointer (SPPTP), which contains a
> > 4K-aligned physical address. The index and encoding for this VMCS field is
> > defined as 0x2030 at this time. The format of the SPPTP is shown in
> > figure 2.
> > This patch introduces the SPP paging structures, whose root page is
> > created at KVM MMU page initialization.
> > It also adds an MMU page role type, spp, to distinguish an SPP page from
> > an EPT page.
> >
> > Patch 5: Introduces the SPP-induced VM exit and its handler.
> > Accesses using guest-physical addresses may cause SPP-induced VM exits due
> > to an SPPT misconfiguration or an SPPT miss. The basic VM exit reason code
> > reported for SPP-induced VM exits is 66.
> >
> > It also introduces the new exit qualification for SPPT-induced VM exits.
> >
> > | Bit   | Contents                                                          |
> > | :---- | :---------------------------------------------------------------- |
> > | 10:0  | Reserved (0).                                                     |
> > | 11    | SPPT VM exit type. Set for SPPT Miss, cleared for SPPT Misconfig. |
> > | 12    | NMI unblocking due to IRET                                        |
> > | 63:13 | Reserved (0)                                                      |
> >
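
As an illustration of how a handler might decode the exit qualification in the table above, here is a minimal C sketch. The macro, enum and function names are hypothetical and not taken from the patches; only the bit positions (11 and 12) come from the table.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; bit positions follow the exit qualification table. */
#define SPPT_EXIT_QUAL_MISS        (1ULL << 11) /* set: miss, clear: misconfig */
#define SPPT_EXIT_QUAL_NMI_UNBLOCK (1ULL << 12) /* NMI unblocking due to IRET */

enum sppt_exit_kind {
	SPPT_EXIT_MISCONFIG,
	SPPT_EXIT_MISS,
};

/* Classify an SPP-induced VM exit from its exit qualification. */
static enum sppt_exit_kind sppt_decode_exit_kind(uint64_t exit_qual)
{
	return (exit_qual & SPPT_EXIT_QUAL_MISS) ? SPPT_EXIT_MISS
						 : SPPT_EXIT_MISCONFIG;
}

/* Report whether the exit carries "NMI unblocking due to IRET". */
static bool sppt_exit_nmi_unblocking(uint64_t exit_qual)
{
	return (exit_qual & SPPT_EXIT_QUAL_NMI_UNBLOCK) != 0;
}
```

A handler for exit reason 66 would presumably branch on the decoded kind, filling in the missing SPPT level on a miss and treating a misconfiguration as an error.
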
> > Patch 6: Adds a handler for the EPT sub-page write protection fault.
> > A control bit in EPT leaf paging-structure entries is defined as "Sub-Page
> > Permission" (SPP bit). The bit position is 61; it is chosen from among the
> > bits that are currently ignored by the processor and available to
> > software.
> > When the hardware walks the SPP page table, if the sub-page region write
> > permission bit is set, the write is allowed; otherwise the write is
> > disallowed and results in an EPT violation.
> > We need to detect this case in the EPT violation handler and trigger a
> > user-space exit, returning the write-protected address (GVA) to user space
> > (QEMU).
> >
> > Patch 7: Introduces ioctls to set/get Sub-Page Write Protection.
> > Two ioctls let a user application set and get the sub-page write
> > protection bitmap per gfn; each gfn corresponds to one bitmap.
> > The user application (QEMU or some other security control daemon) sets
> > the protection bitmap via this ioctl.
> > The API is defined as:
> >
> > struct kvm_subpage {
> >         __u64 base_gfn;
> >         __u64 npages;
> >         /* sub-page write-access bitmap array */
> >         __u32 access_map[SUBPAGE_MAX_BITMAP];
> > } sp;
> >
> > kvm_vm_ioctl(s, KVM_SUBPAGES_SET_ACCESS, &sp);
> > kvm_vm_ioctl(s, KVM_SUBPAGES_GET_ACCESS, &sp);
> >
> > Patch 8 ~ Patch 9: Set up the SPP page table and update the EPT leaf entry
> > with the SPP enable bit.
> > If the sub-page write permission VM-execution control is set, treatment of
> > write accesses to guest-physical addresses depends on the state of the
> > accumulated write-access bit (position 1) and the sub-page permission bit
> > (position 61) in the EPT leaf paging-structure entry.
> > Software updates the sub-page permission bit of the EPT leaf entry in
> > kvm_set_subpage (patch 7). If the EPT write-access bit is 0 and the SPP
> > bit is 1 in the leaf EPT paging-structure entry that maps a 4KB page,
> > the hardware looks up a VMM-managed Sub-Page Permission Table (SPPT),
> > which is prepared by kvm_set_subpage (patch 8).
> > The hardware uses the guest-physical address and bits 11:7 of the address
> > accessed to look up the SPPT and fetch the write permission bit for the
> > 128-byte sub-page region being accessed within the 4K guest-physical
> > page. If the sub-page region write permission bit is set, the write is
> > allowed; otherwise the write is disallowed and results in an EPT
> > violation.
> > Guest-physical pages mapped via leaf EPT paging-structure entries for which
> > the accumulated write-access bit and the SPP bit are both clear (0)
> > generate EPT violations on memory write accesses. Guest-physical pages
> > mapped via EPT paging-structure entries for which the accumulated
> > write-access bit is set (1) allow writes, effectively ignoring the SPP bit
> > in the leaf EPT paging-structure entry.
> > Software sets up SPP page table levels 4, 3 and 2 in the same way as the
> > EPT page structure, and fills the level 1 entry from the 32-bit bitmap
> > kept for each 4K page, which is thereby divided into 32 sub-pages of 128
> > bytes each.
> >
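
To make the 32 x 128-byte split concrete, the sketch below derives the sub-page index from address bits 11:7 and builds a 32-bit access map that write-protects a byte range inside one 4K page. The names are hypothetical, and it assumes that bit i of the map covers sub-page i and that a set bit means "writable", which matches the L1E write-permission semantics but is not spelled out for access_map in this cover letter.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the geometry (32 sub-pages of 128 bytes) is from the text. */
#define SPP_SUBPAGE_SHIFT     7u
#define SPP_SUBPAGES_PER_PAGE 32u
#define SPP_PAGE_SIZE         4096u

/* Sub-page index within the 4K page: bits 11:7 of the address. */
static unsigned int spp_subpage_index(uint64_t gpa)
{
	return (unsigned int)((gpa >> SPP_SUBPAGE_SHIFT) &
			      (SPP_SUBPAGES_PER_PAGE - 1));
}

/*
 * Return an access map in which every sub-page is writable except those
 * overlapping [offset, offset + len) within the page.
 * Assumes bit i covers sub-page i and a set bit means "writable".
 */
static uint32_t spp_protect_range(uint32_t offset, uint32_t len)
{
	uint32_t map = UINT32_MAX;
	unsigned int first, last, i;

	assert(len > 0 && offset < SPP_PAGE_SIZE &&
	       len <= SPP_PAGE_SIZE - offset);

	first = offset >> SPP_SUBPAGE_SHIFT;
	last = (offset + len - 1) >> SPP_SUBPAGE_SHIFT;

	for (i = first; i <= last; i++)
		map &= ~(UINT32_C(1) << i);

	return map;
}
```

Note that protection is rounded outwards to 128-byte boundaries: protecting two bytes that straddle offsets 127 and 128 clears the bits for both sub-page 0 and sub-page 1.
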
> > The SPP L4E, L3E and L2E format is defined in the figure below.
> >
> > | Bit    | Contents                                                               |
> > | :----- | :--------------------------------------------------------------------- |
> > | 0      | Valid entry when set; indicates whether the entry is present           |
> > | 11:1   | Reserved (0)                                                           |
> > | N-1:12 | Physical address of 4K aligned SPPT LX-1 Table referenced by the entry |
> > | 51:N   | Reserved (0)                                                           |
> > | 63:52  | Reserved (0)                                                           |
> >
> > Note: N is the physical address width supported by the processor, X is the
> > page level
> >
> > The SPP L1E format is defined in the figure below.
> >
> > | Bit   | Contents                                                          |
> > | :---- | :---------------------------------------------------------------- |
> > | 0+2i  | Write permission for i-th 128 byte sub-page region.               |
> > | 1+2i  | Reserved (0).                                                     |
> >
> > Note: `0<=i<=31`
> >
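
The L1E layout above puts the 32 write-permission bits at even positions of a 64-bit entry, so a 32-bit per-page bitmap has to be spread out when the entry is filled. A minimal sketch of that conversion, with hypothetical function names and assuming bit i of the bitmap is the write permission of sub-page i:

```c
#include <stdint.h>

/* Spread a 32-bit access map into an SPPT L1 entry: map bit i -> L1E bit 2*i. */
static uint64_t sppt_l1e_from_access_map(uint32_t access_map)
{
	uint64_t l1e = 0;
	unsigned int i;

	for (i = 0; i < 32; i++) {
		if (access_map & (UINT32_C(1) << i))
			l1e |= UINT64_C(1) << (2 * i);
	}
	/* Odd bits (1+2i) stay zero, as the table marks them reserved. */
	return l1e;
}

/* Inverse: collect the even bits of an L1 entry back into a 32-bit map. */
static uint32_t sppt_l1e_to_access_map(uint64_t l1e)
{
	uint32_t access_map = 0;
	unsigned int i;

	for (i = 0; i < 32; i++) {
		if (l1e & (UINT64_C(1) << (2 * i)))
			access_map |= UINT32_C(1) << i;
	}
	return access_map;
}
```

A fully writable page (all 32 map bits set) therefore becomes an L1 entry of 0x5555555555555555.
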
> >
> > Zhang Yi Z (10):
> > KVM: VMX: Added EPT Subpage Protection Documentation.
> > x86/cpufeature: Add intel Sub-Page Protection to CPU features
> > KVM: VMX: Added VMX SPP feature flags and VM-Execution Controls.
> > KVM: VMX: Introduce the SPPTP and SPP page table.
> > KVM: VMX: Introduce SPP-Induced vm exit and it's handle.
> > KVM: VMX: Added handle of SPP write protection fault.
> > KVM: VMX: Introduce ioctls to set/get Sub-Page Write Protection.
> > KVM: VMX: Update the EPT leaf entry indicated with the SPP enable bit.
> > KVM: VMX: Added setup spp page structure.
> > KVM: VMX: implement setup SPP page structure in spp miss.
> >
> > Documentation/virtual/kvm/spp_design_kvm.txt | 272 +++++++++++++++++++++
> > arch/x86/include/asm/cpufeatures.h | 1 +
> > arch/x86/include/asm/kvm_host.h | 18 +-
> > arch/x86/include/asm/vmx.h | 10 +
> > arch/x86/include/uapi/asm/vmx.h | 2 +
> > arch/x86/kernel/cpu/intel.c | 4 +
> > arch/x86/kvm/mmu.c | 340 ++++++++++++++++++++++++++-
> > arch/x86/kvm/mmu.h | 1 +
> > arch/x86/kvm/vmx.c | 104 ++++++++
> > arch/x86/kvm/x86.c | 99 +++++++-
> > include/linux/kvm_host.h | 5 +
> > include/uapi/linux/kvm.h | 16 ++
> > virt/kvm/kvm_main.c | 26 ++
> > 13 files changed, 893 insertions(+), 5 deletions(-)
> > create mode 100644 Documentation/virtual/kvm/spp_design_kvm.txt
> >
> > --
> > 2.7.4
> >
>