[PATCH V8 10/44] Documentation/pkeys: Add initial PKS documentation

From: ira . weiny
Date: Thu Jan 27 2022 - 12:56:36 EST


From: Ira Weiny <ira.weiny@xxxxxxxxx>

Add initial overview and configuration information about PKS.

Signed-off-by: Ira Weiny <ira.weiny@xxxxxxxxx>
---
Documentation/core-api/protection-keys.rst | 57 ++++++++++++++++++++--
1 file changed, 53 insertions(+), 4 deletions(-)

diff --git a/Documentation/core-api/protection-keys.rst b/Documentation/core-api/protection-keys.rst
index 12331db474aa..58670e3ee39e 100644
--- a/Documentation/core-api/protection-keys.rst
+++ b/Documentation/core-api/protection-keys.rst
@@ -12,6 +12,9 @@ PKeys Userspace (PKU) is a feature which is found on Intel's Skylake "Scalable
Processor" Server CPUs and later. And it will be available in future
non-server Intel parts and future AMD processors.

+Protection Keys for Supervisor pages (PKS) is available in the SDM since May
+2020.
+
pkeys work by dedicating 4 previously Reserved bits in each page table entry to
a "protection key", giving 16 possible keys.

@@ -22,13 +25,20 @@ and Write Disable) for each of 16 keys.
Being a CPU register, PKRU is inherently thread-local, potentially giving each
thread a different set of protections from every other thread.

-There are two instructions (RDPKRU/WRPKRU) for reading and writing to the
-register. The feature is only available in 64-bit mode, even though there is
+For Userspace (PKU), there are two instructions (RDPKRU/WRPKRU) for reading and
+writing to the register.
+
+For Supervisor (PKS), the register (MSR_IA32_PKRS) is accessible only to the
+kernel through rdmsr and wrmsr.
+
+The feature is only available in 64-bit mode, even though there is
theoretically space in the PAE PTEs. These permissions are enforced on data
access only and have no effect on instruction fetches.

-Syscalls
-========
+
+
+Syscalls for user space keys
+============================

There are 3 system calls which directly interact with pkeys::

@@ -95,3 +105,42 @@ with a read()::
The kernel will send a SIGSEGV in both cases, but si_code will be set
to SEGV_PKERR when violating protection keys versus SEGV_ACCERR when
the plain mprotect() permissions are violated.
+
+
+Kernel API for PKS support
+==========================
+
+Overview
+--------
+
+Similar to user space pkeys, supervisor pkeys allow additional protections to
+be defined for a supervisor mappings. Unlike user space pkeys, violations of
+these protections result in a kernel oops.
+
+Supervisor Memory Protection Keys (PKS) is a feature which is found on Intel's
+Sapphire Rapids (and later) "Scalable Processor" Server CPUs. It will also be
+available in future non-server Intel parts.
+
+Also qemu has support as well: https://www.qemu.org/2021/04/30/qemu-6-0-0/
+
+Kconfig
+-------
+Kernel users intending to use PKS support should depend on
+ARCH_HAS_SUPERVISOR_PKEYS, and select ARCH_ENABLE_SUPERVISOR_PKEYS to turn on
+this support within the core.
+
+
+MSR details
+-----------
+
+It should be noted that the underlying WRMSR(MSR_IA32_PKRS) is not serializing
+but still maintains ordering properties similar to WRPKRU.
+
+Older versions of the SDM on PKRS may be wrong with regard to this
+serialization. The text should be the same as that of WRPKRU. From the WRPKRU
+text:
+
+ WRPKRU will never execute transiently. Memory accesses
+ affected by PKRU register will not execute (even transiently)
+ until all prior executions of WRPKRU have completed execution
+ and updated the PKRU register.
--
2.31.1