[PATCH v11 12/13] intel_sgx: driver documentation
From: Jarkko Sakkinen
Date: Fri Jun 08 2018 - 13:22:26 EST
Documentation of the features of the Software Guard eXtensions usable
for the Linux kernel and how the driver internals uses these features.
In addition, contains documentation for the ioctl API.
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@xxxxxxxxxxxxxxx>
Documentation/index.rst | 1 +
Documentation/x86/intel_sgx.rst | 195 ++++++++++++++++++++++++++++++++
2 files changed, 196 insertions(+)
create mode 100644 Documentation/x86/intel_sgx.rst
diff --git a/Documentation/index.rst b/Documentation/index.rst
index 3b99ab931d41..b9fb92928e8c 100644
@@ -100,6 +100,7 @@ implementation.
diff --git a/Documentation/x86/intel_sgx.rst b/Documentation/x86/intel_sgx.rst
new file mode 100644
@@ -0,0 +1,195 @@
+Intel(R) SGX driver
+Intel(R) SGX is a set of CPU instructions that can be used by applications to
+set aside private regions of code and data. The code outside the enclave is
+disallowed to access the memory inside the enclave by the CPU access control.
+In a way you can think that SGX provides inverted sandbox. It protects the
+application from a malicious host.
+You can tell if your CPU supports SGX by looking into ``/proc/cpuinfo``:
+ ``cat /proc/cpuinfo | grep sgx``
+Overview of SGX
+SGX has a set of data structures to maintain information about the enclaves and
+their security properties. BIOS reserves a fixed size region of physical memory
+for these structures by setting Processor Reserved Memory Range Registers
+This memory range is protected from outside access by the CPU and all the data
+coming in and out of the CPU package is encrypted by a key that is generated for
+each boot cycle.
+Enclaves execute in ring-3 in a special enclave submode using pages from the
+reserved memory range. A fixed logical address range for the enclave is reserved
+by ENCLS(ECREATE), a leaf instruction used to create enclaves. It is referred in
+the documentation commonly as the ELRANGE.
+Every memory access to the ELRANGE is asserted by the CPU. If the CPU is not
+executing in the enclave mode inside the enclave, #GP is raised. On the other
+hand enclave code can make memory accesses both inside and outside of the
+Enclave can only execute code inside the ELRANGE. Instructions that may cause
+VMEXIT, IO instructions and instructions that require a privilege change are
+prohibited inside the enclave. Interrupts and exceptions always cause enclave
+to exit and jump to an address outside the enclave given when the enclave is
+entered by using the leaf instruction ENCLS(EENTER).
+The protected memory range contains the following data:
+* **Enclave Page Cache (EPC):** protected pages
+* **Enclave Page Cache Map (EPCM):** a database that describes the state of the
+ pages and link them to an enclave.
+EPC has a number of different types of pages:
+* **SGX Enclave Control Structure (SECS)**: describes the global
+ properties of an enclave.
+* **Regular (REG):** code and data pages in the ELRANGE.
+* **Thread Control Structure (TCS):** pages that define entry points inside an
+ enclave. The enclave can only be entered through these entry points and each
+ can host a single hardware thread at a time.
+* **Version Array (VA)**: 64-bit version numbers for pages that have been
+ swapped outside the enclave. Each page contains 512 version numbers.
+To launch an enclave, two structures must be provided for ENCLS(EINIT):
+1. **SIGSTRUCT:** signed measurement of the enclave binary.
+2. **EINITTOKEN:** a cryptographic token CMAC-signed with a AES256-key called
+ *launch key*, which is re-generated for each boot cycle.
+The CPU holds a SHA256 hash of a 3072-bit RSA public key inside
+IA32_SGXLEPUBKEYHASHn MSRs. Enclaves with a SIGSTRUCT that is signed with this
+key do not require a valid EINITTOKEN and can be authorized with special
+privileges. One of those privileges is ability to acquire the launch key with
+**IA32_FEATURE_CONTROL** is used by to BIOS configure whether
+IA32_SGXLEPUBKEYHASH MSRs are read-only or read-write before locking the
+feature control register and handing over control to the operating system.
+The construction is started by filling out the SECS that contains enclave
+address range, privileged attributes and measurement of TCS and REG pages (pages
+that will be mapped to the address range) among the other things. This structure
+is passed out to the ENCLS(ECREATE) together with a physical address of a page
+in EPC that will hold the SECS.
+Then pages are added with ENCLS(EADD) and measured with ENCLS(EEXTEND). Finally
+enclave is initialized with ENCLS(EINIT). ENCLS(INIT) checks that the SIGSTRUCT
+is signed with the contained public key and that the supplied EINITTOKEN is
+valid (CMAC'd with the launch key). If these hold, the enclave is successfully
+Enclave pages can be swapped out with ENCLS(EWB) to the unprotected memory. In
+addition to the EPC page, ENCLS(EWB) takes in a VA page and address for PCMD
+structure (Page Crypto MetaData) as input. The VA page will seal a version
+number for the page. PCMD is 128 byte structure that contains tracking
+information for the page, most importantly its MAC. With these structures the
+enclave is sealed and rollback protected while it resides in the unprotected
+Before the page can be swapped out it must not have any active TLB references.
+By using ENCLS(EBLOCK) instructions no new TLB entries can be created to it.
+After this the a counter called *epoch* associated hardware threads inside the
+enclave is increased with ENCLS(ETRACK). After all the threads from the previous
+epoch have exited the page can be safely swapped out.
+An enclave memory access to a swapped out pages will cause #PF. #PF handler can
+fault the page back by using ENCLS(ELDU).
+Because SGX has an ever evolving and expanding feature set, it's possible for
+a BIOS or VMM to configure a system in such a way that not all cpus are equal,
+e.g. where Launch Control is only enabled on a subset of cpus. Linux does
+*not* support such a heterogenous system configuration, nor does it even
+attempt to play nice in the face of a misconfigured system. With the exception
+of Launch Control's hash MSRs, which can vary per cpu, Linux assumes that all
+cpus have a configuration that is identical to the boot cpu.
+Roles and responsibilities
+SGX introduces system resources, e.g. EPC memory, that must be accessible to
+multiple entities, e.g. the native kernel driver (to expose SGX to userspace)
+and KVM (to expose SGX to VMs), ideally without introducing any dependencies
+between each SGX entity. To that end, the kernel owns and manages the shared
+system resources, i.e. the EPC and Launch Control MSRs, and defines functions
+that provide appropriate access to the shared resources. SGX support for
+userpace and VMs is left to the SGX platform driver and KVM respectively.
+For privileged enclaves the launch is performed simply by submitting the
+SIGSTRUCT for that enclave to ENCLS(EINIT). For unprivileged enclaves the
+driver hosts a process in ring-3 that hosts a launch enclave signed with a key
+supplied for kbuild.
+The current implementation of the launch enclave generates a token for any
+enclave. In the future it could be potentially extended to have ways to
+configure policy what can be lauched.
+The driver will fail to initialize if it cannot start its own launch enclave.
+A user space application can submit a SIGSTRUCT instance through the ioctl API.
+The kernel will take care of the rest.
+This design assures that the Linux kernel has always full control, which
+enclaves get to launch and which do not, even if the public key MSRs are
+read-only. Having launch intrinsics inside the kernel also enables easy
+development of enclaves without necessarily needing any heavy weight SDK.
+Having a low-barrier to implement enclaves could make sense for example for
+system daemons where amount of dependecies ought to be minimized.
+Due to the unique requirements for swapping EPC pages, and because EPC pages
+(currently) do not have associated page structures, management of the EPC is
+not handled by the standard Linux swapper. SGX directly handles swapping
+of EPC pages, including a kthread to initiate reclaim and a rudimentary LRU
+mechanism. Consumsers of EPC pages, e.g. the SGX driver, are required to
+implement function callbacks that can be invoked by the kernel to age,
+swap, and/or forcefully reclaim a target EPC page. In effect, the kernel
+controls what happens and when, while the consumers (driver, KVM, etc..) do
+the actual work.
+.. kernel-doc:: drivers/platform/x86/intel_sgx/sgx_ioctl.c
+ :functions: sgx_ioc_enclave_create
+.. kernel-doc:: arch/x86/include/uapi/asm/sgx.h
+* System Programming Manual: 39.1.4 Intel® SGX Launch Control Configuration