Re: [PATCH, RFC 00/62] Intel MKTME enabling

From: Mike Rapoport
Date: Wed May 29 2019 - 03:34:49 EST


On Wed, May 08, 2019 at 05:43:20PM +0300, Kirill A. Shutemov wrote:
> = Intro =
>
> The patchset brings enabling of Intel Multi-Key Total Memory Encryption.
> It consists of changes to multiple subsystems:
>
> * Core MM: infrastructure for allocating pages, dealing with encrypted VMAs
> and providing an API to set up encrypted mappings.
> * arch/x86: feature enumeration, programming keys into hardware, setting up
> page table entries for encrypted pages and more.
> * Key management service: setup and management of encryption keys.
> * DMA/IOMMU: dealing with encrypted memory on the I/O side.
> * KVM: interaction with the virtualization side.
> * Documentation: description of APIs and usage examples.
>
> The patchset is huge. This submission aims to give a view of the full picture and
> get feedback on the overall design. The patchset will be split into more
> digestible pieces later.
>
> Please review. Any feedback is welcome.

It would be nice to have a brief usage description in the cover letter rather
than in the last patches in the series ;-)

> = Overview =
>
> Multi-Key Total Memory Encryption (MKTME)[1] is a technology that allows
> transparent memory encryption in upcoming Intel platforms. It uses a new
> instruction (PCONFIG) for key setup and selects a key for individual pages by
> repurposing physical address bits in the page tables.
>
> These patches add support for MKTME into the existing kernel keyring subsystem
> and add a new encrypt_mprotect() system call that can be used by applications
> to encrypt anonymous memory with keys obtained from the keyring.
>
> This architecture supports encrypting both normal, volatile DRAM and persistent
> memory. However, these patches do not implement persistent memory support. We
> anticipate adding that support next.
>
> == Hardware Background ==
>
> MKTME is built on top of an existing single-key technology called TME. TME
> encrypts all system memory using a single key generated by the CPU on every
> boot of the system. TME provides mitigation against physical attacks, such as
> physically removing a DIMM or watching memory bus traffic.
>
> MKTME enables the use of multiple encryption keys[2], allowing selection of the
> encryption key per-page using the page tables. Encryption keys are programmed
> into each memory controller and the same set of keys is available to all
> entities on the system with access to that memory (all cores, DMA engines,
> etc...).
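
A minimal sketch of what "selection of the encryption key per-page using the
page tables" could look like, assuming the KeyID occupies the top bits of the
physical address field. The widths (52 physical address bits, 6 KeyID bits)
and all names here are made up for illustration and are not taken from the
patches:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed widths, for illustration only; real hardware enumerates these. */
#define PHYS_ADDR_BITS 52u
#define KEYID_BITS      6u
#define KEYID_SHIFT    (PHYS_ADDR_BITS - KEYID_BITS)
#define KEYID_MASK     (((UINT64_C(1) << KEYID_BITS) - 1) << KEYID_SHIFT)

/* Hypothetical helper: tag a physical address with a KeyID. */
static uint64_t set_keyid(uint64_t paddr, unsigned int keyid)
{
	return (paddr & ~KEYID_MASK) | ((uint64_t)keyid << KEYID_SHIFT);
}

/* Hypothetical helper: read the KeyID back out of a tagged address. */
static unsigned int get_keyid(uint64_t paddr)
{
	return (unsigned int)((paddr & KEYID_MASK) >> KEYID_SHIFT);
}

/* Hypothetical helper: drop the KeyID bits to get the plain address. */
static uint64_t strip_keyid(uint64_t paddr)
{
	return paddr & ~KEYID_MASK;
}
```

With these assumed widths, set_keyid(0x1000, 4) places KeyID 4 in bits 51:46
and leaves the low address bits untouched; strip_keyid() recovers the plain
physical address.
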
>
> MKTME inherits many of the mitigations against hardware attacks from TME. Like
> TME, MKTME does not mitigate vulnerable or malicious operating systems or
> virtual machine managers. MKTME offers additional mitigations when compared to
> TME.
>
> TME and MKTME use the AES encryption algorithm in the AES-XTS mode. This mode,
> typically used for block-based storage devices, takes the physical address of
> the data into account when encrypting each block. This ensures that the
> effective key is different for each block of memory. Moving encrypted content
> across physical addresses results in garbage on read, mitigating block-relocation
> attacks. This property is the reason many of the discussed attacks require
> control of a shared physical page to be handed from the victim to the attacker.
>
> == MKTME-Provided Mitigations ==
>
> MKTME adds a few mitigations against attacks that are not mitigated when using
> TME alone. The first set are mitigations against software attacks that are
> familiar today:
>
> * Kernel Mapping Attacks: information disclosures that leverage the
> kernel direct map are mitigated against disclosing user data.
> * Freed Data Leak Attacks: removing an encryption key from the
> hardware mitigates future user information disclosure.
>
> The next set are attacks that depend on specialized hardware, such as an "evil
> DIMM" or a DDR interposer:
>
> * Cross-Domain Replay Attack: data is captured from one domain
> (guest) and replayed to another at a later time.
> * Cross-Domain Capture and Delayed Compare Attack: data is captured
> and later analyzed to discover secrets.
> * Key Wear-out Attack: data is captured and analyzed in order to
> weaken the AES encryption itself.
>
> More details on these attacks are below.
>
> === Kernel Mapping Attacks ===
>
> Information disclosure vulnerabilities leverage the kernel direct map because
> many vulnerabilities involve manipulation of kernel data structures (examples:
> CVE-2017-7277, CVE-2017-9605). We normally think of these bugs as leaking
> valuable *kernel* data, but they can leak application data when application
> pages are recycled for kernel use.
>
> With this MKTME implementation, there is a direct map created for each MKTME
> KeyID which is used whenever the kernel needs to access plaintext. But, all
> kernel data structures are accessed via the direct map for KeyID-0. Thus,
> memory reads which are not coordinated with the KeyID get garbage (for example,
> accessing KeyID-4 data with the KeyID-0 mapping).
>
> This means that if sensitive data encrypted using MKTME is leaked via the
> KeyID-0 direct map, ciphertext decrypted with the wrong key will be disclosed.
> To disclose plaintext, an attacker must "pivot" to the correct direct mapping,
> which is non-trivial because there are no kernel data structures in the
> KeyID!=0 direct mapping.
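
A minimal sketch of the per-KeyID direct map idea, assuming the per-KeyID
mappings are laid out back to back after the KeyID-0 one. The base address,
the per-KeyID size and the helper names are hypothetical and only meant to
show why a KeyID-0 pointer and a KeyID-4 pointer to the same page differ:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout parameters, chosen only for this sketch. */
#define DIRECT_MAP_BASE UINT64_C(0xffff888000000000)
#define DIRECT_MAP_SIZE (UINT64_C(1) << 36)	/* range covered per KeyID */

/* Kernel virtual address of paddr as seen through the given KeyID. */
static uint64_t keyid_va(uint64_t paddr, unsigned int keyid)
{
	return DIRECT_MAP_BASE + (uint64_t)keyid * DIRECT_MAP_SIZE + paddr;
}

/* Which KeyID's direct map does this virtual address fall into? */
static unsigned int va_to_keyid(uint64_t va)
{
	return (unsigned int)((va - DIRECT_MAP_BASE) / DIRECT_MAP_SIZE);
}

/* Physical address behind a direct-map virtual address. */
static uint64_t va_to_paddr(uint64_t va)
{
	return (va - DIRECT_MAP_BASE) % DIRECT_MAP_SIZE;
}
```

Under these assumptions the same physical page has one alias per KeyID, and a
read through the KeyID-0 alias of a page written through the KeyID-4 alias is
the "wrong key" access described above.
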
>
> === Freed Data Leak Attack ===
>
> The kernel has a history of bugs around uninitialized data. Usually, we think
> of these bugs as leaking sensitive kernel data, but they can also be used to
> leak application secrets.
>
> MKTME can help mitigate the case where application secrets are leaked:
>
> * App (or VM) places a secret in a page
> * App exits or frees memory to kernel allocator
> * Page added to allocator free list
> * Attacker reallocates page to a purpose where it can read the page
>
> Now, imagine MKTME was in use on the memory being leaked. The data can only be
> leaked as long as the key is programmed in the hardware. If the key is
> de-programmed, like after all pages are freed after a guest is shut down, any
> future reads will just see ciphertext.
>
> Basically, the key is a convenient choke-point: you can be more confident that
> data encrypted with it is inaccessible once the key is removed.
>
> === Cross-Domain Replay Attack ===
>
> MKTME mitigates cross-domain replay attacks where an attacker replaces an
> encrypted block owned by one domain with a block owned by another domain.
> MKTME does not prevent this replacement from occurring, but it does keep
> plaintext from being disclosed if the domains use different keys.
>
> With TME, the attack could be executed by:
> * A victim places secret in memory, at a given physical address.
> Note: AES-XTS is what restricts the attack to being performed at a
> single physical address instead of across different physical
> addresses
> * Attacker captures victim secret's ciphertext
> * Later on, after victim frees the physical address, attacker gains
> ownership
> * Attacker puts the ciphertext at the address and gets the secret
> plaintext
>
> But with MKTME, due to the presumably different keys used by the attacker and
> the victim, the attacker cannot successfully decrypt old ciphertext.
>
> === Cross-Domain Capture and Delayed Compare Attack ===
>
> This is also referred to as a kind of dictionary attack.
>
> Similarly, MKTME protects against cross-domain capture-and-compare attacks.
> Consider the following scenario:
> * A victim places a secret in memory, at a known physical address
> * Attacker captures victim's ciphertext
> * Attacker gains control of the target physical address, perhaps
> after the victim's VM is shut down or its memory reclaimed.
> * Attacker computes and writes many possible plaintexts until new
> ciphertext matches content captured previously.
>
> Secrets which have low (plaintext) entropy are more vulnerable to this attack
> because they reduce the number of possible plaintexts an attacker has to
> compute and write.
>
> The attack will not work if the attacker and the victim use different keys.
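
A minimal sketch of the capture-and-compare scenario, assuming a toy keyed,
address-tweaked mixing function as a stand-in for the memory encryption. It
is NOT AES-XTS and has no cryptographic strength; toy_encrypt(),
compare_attack() and the guess range are made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* splitmix64 finalizer: a bijective 64-bit mixer, not a cipher. */
static uint64_t mix64(uint64_t x)
{
	x ^= x >> 30;
	x *= UINT64_C(0xbf58476d1ce4e5b9);
	x ^= x >> 27;
	x *= UINT64_C(0x94d049bb133111eb);
	x ^= x >> 31;
	return x;
}

/* Toy stand-in: output depends on key, physical address and plaintext. */
static uint64_t toy_encrypt(uint64_t key, uint64_t paddr, uint64_t plaintext)
{
	return mix64(plaintext ^ key ^ mix64(paddr));
}

/*
 * The attacker owns paddr and writes every guess in 0..max_guess with its
 * own key, comparing the result with the captured ciphertext. Returns the
 * matching guess, or -1 if nothing matches.
 */
static int64_t compare_attack(uint64_t attacker_key, uint64_t paddr,
			      uint64_t captured, uint32_t max_guess)
{
	for (uint64_t guess = 0; guess <= max_guess; guess++)
		if (toy_encrypt(attacker_key, paddr, guess) == captured)
			return (int64_t)guess;
	return -1;
}
```

With a low-entropy secret such as a four-digit PIN, the search succeeds when
attacker and victim share a key (the TME case) and comes back empty when the
attacker's writes go through a different key.
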
>
> === Key Wear-out Attack ===
>
> Repeated use of an encryption key might be used by an attacker to infer
> information about the key or the plaintext, weakening the encryption. The
> higher the bandwidth of the encryption engine, the more vulnerable the key is
> to wear-out. The MKTME memory encryption hardware works at the speed of the
> memory bus, which has high bandwidth.
>
> Such a weakness has been demonstrated[3] on a theoretical cipher with
> properties similar to those of AES-XTS.
>
> An attack would take the following steps:
> * Victim system is using TME with AES-XTS-128
> * Attacker repeatedly captures ciphertext/plaintext pairs (can be
> performed with an online hardware attack such as an interposer).
> * Attacker compels repeated use of the key under attack for a
> sustained time period without a system reboot[4].
> * Attacker discovers a ciphertext collision (two plaintexts
> translating to the same ciphertext)
> * Attacker can induce controlled modifications to the targeted
> plaintext by modifying the colliding ciphertext
>
> MKTME mitigates key wear-out in two ways:
> * Keys can be rotated periodically to mitigate wear-out. Since TME
> keys are generated at boot, rotation of TME keys requires a
> reboot. In contrast, MKTME allows rotation while the system is
> booted. An application could implement a policy to rotate keys at
> a frequency which is not feasible to attack.
> * In the case that MKTME is used to encrypt two guests' memory with
> two different keys, an attack on one guest's key would not weaken
> the key used in the second guest.
>
> --
>
> [1] https://software.intel.com/sites/default/files/managed/a5/16/Multi-Key-Total-Memory-Encryption-Spec.pdf
> [2] The MKTME architecture supports up to 16 bits of KeyIDs, so a
> maximum of 65535 keys on top of the "TME key" at KeyID-0. The
> first implementation is expected to support 6 bits, making 63 keys
> available to applications. However, this is not guaranteed. The
> number of available keys could be reduced if, for instance,
> additional physical address space is desired over additional
> KeyIDs.
> [3] http://web.cs.ucdavis.edu/~rogaway/papers/offsets.pdf
> [4] This sustained time required for an attack could vary from days
> to years depending on the attacker's goals.
>
> Alison Schofield (33):
> x86/pconfig: Set a valid encryption algorithm for all MKTME commands
> keys/mktme: Introduce a Kernel Key Service for MKTME
> keys/mktme: Preparse the MKTME key payload
> keys/mktme: Instantiate and destroy MKTME keys
> keys/mktme: Move the MKTME payload into a cache aligned structure
> keys/mktme: Strengthen the entropy of CPU generated MKTME keys
> keys/mktme: Set up PCONFIG programming targets for MKTME keys
> keys/mktme: Program MKTME keys into the platform hardware
> keys/mktme: Set up a percpu_ref_count for MKTME keys
> keys/mktme: Require CAP_SYS_RESOURCE capability for MKTME keys
> keys/mktme: Store MKTME payloads if cmdline parameter allows
> acpi: Remove __init from acpi table parsing functions
> acpi/hmat: Determine existence of an ACPI HMAT
> keys/mktme: Require ACPI HMAT to register the MKTME Key Service
> acpi/hmat: Evaluate topology presented in ACPI HMAT for MKTME
> keys/mktme: Do not allow key creation in unsafe topologies
> keys/mktme: Support CPU hotplug for MKTME key service
> keys/mktme: Find new PCONFIG targets during memory hotplug
> keys/mktme: Program new PCONFIG targets with MKTME keys
> keys/mktme: Support memory hotplug for MKTME keys
> mm: Generalize the mprotect implementation to support extensions
> syscall/x86: Wire up a system call for MKTME encryption keys
> x86/mm: Set KeyIDs in encrypted VMAs for MKTME
> mm: Add the encrypt_mprotect() system call for MKTME
> x86/mm: Keep reference counts on encrypted VMAs for MKTME
> mm: Restrict MKTME memory encryption to anonymous VMAs
> selftests/x86/mktme: Test the MKTME APIs
> x86/mktme: Overview of Multi-Key Total Memory Encryption
> x86/mktme: Document the MKTME provided security mitigations
> x86/mktme: Document the MKTME kernel configuration requirements
> x86/mktme: Document the MKTME Key Service API
> x86/mktme: Document the MKTME API for anonymous memory encryption
> x86/mktme: Demonstration program using the MKTME APIs
>
> Jacob Pan (3):
> iommu/vt-d: Support MKTME in DMA remapping
> x86/mm: introduce common code for mem encryption
> x86/mm: Use common code for DMA memory encryption
>
> Kai Huang (2):
> mm, x86: export several MKTME variables
> kvm, x86, mmu: setup MKTME keyID to spte for given PFN
>
> Kirill A. Shutemov (24):
> mm: Do no merge VMAs with different encryption KeyIDs
> mm: Add helpers to setup zero page mappings
> mm/ksm: Do not merge pages with different KeyIDs
> mm/page_alloc: Unify alloc_hugepage_vma()
> mm/page_alloc: Handle allocation for encrypted memory
> mm/khugepaged: Handle encrypted pages
> x86/mm: Mask out KeyID bits from page table entry pfn
> x86/mm: Introduce variables to store number, shift and mask of KeyIDs
> x86/mm: Preserve KeyID on pte_modify() and pgprot_modify()
> x86/mm: Detect MKTME early
> x86/mm: Add a helper to retrieve KeyID for a page
> x86/mm: Add a helper to retrieve KeyID for a VMA
> x86/mm: Add hooks to allocate and free encrypted pages
> x86/mm: Map zero pages into encrypted mappings correctly
> x86/mm: Rename CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING
> x86/mm: Allow to disable MKTME after enumeration
> x86/mm: Calculate direct mapping size
> x86/mm: Implement syncing per-KeyID direct mappings
> x86/mm: Handle encrypted memory in page_to_virt() and __pa()
> mm/page_ext: Export lookup_page_ext() symbol
> mm/rmap: Clear vma->anon_vma on unlink_anon_vmas()
> x86/mm: Disable MKTME on incompatible platform configurations
> x86/mm: Disable MKTME if not all system memory supports encryption
> x86: Introduce CONFIG_X86_INTEL_MKTME
>
> .../admin-guide/kernel-parameters.rst | 1 +
> .../admin-guide/kernel-parameters.txt | 11 +
> Documentation/x86/mktme/index.rst | 13 +
> .../x86/mktme/mktme_configuration.rst | 17 +
> Documentation/x86/mktme/mktme_demo.rst | 53 ++
> Documentation/x86/mktme/mktme_encrypt.rst | 57 ++
> Documentation/x86/mktme/mktme_keys.rst | 96 +++
> Documentation/x86/mktme/mktme_mitigations.rst | 150 ++++
> Documentation/x86/mktme/mktme_overview.rst | 57 ++
> Documentation/x86/x86_64/mm.txt | 4 +
> arch/alpha/include/asm/page.h | 2 +-
> arch/x86/Kconfig | 29 +-
> arch/x86/entry/syscalls/syscall_32.tbl | 1 +
> arch/x86/entry/syscalls/syscall_64.tbl | 1 +
> arch/x86/include/asm/intel-family.h | 2 +
> arch/x86/include/asm/intel_pconfig.h | 14 +-
> arch/x86/include/asm/mem_encrypt.h | 29 +
> arch/x86/include/asm/mktme.h | 93 +++
> arch/x86/include/asm/page.h | 4 +
> arch/x86/include/asm/page_32.h | 1 +
> arch/x86/include/asm/page_64.h | 4 +-
> arch/x86/include/asm/pgtable.h | 19 +
> arch/x86/include/asm/pgtable_types.h | 23 +-
> arch/x86/include/asm/setup.h | 6 +
> arch/x86/kernel/cpu/intel.c | 58 +-
> arch/x86/kernel/head64.c | 4 +
> arch/x86/kernel/setup.c | 3 +
> arch/x86/kvm/mmu.c | 18 +-
> arch/x86/mm/Makefile | 3 +
> arch/x86/mm/init_64.c | 68 ++
> arch/x86/mm/kaslr.c | 11 +-
> arch/x86/mm/mem_encrypt_common.c | 28 +
> arch/x86/mm/mktme.c | 630 ++++++++++++++
> drivers/acpi/hmat/hmat.c | 67 ++
> drivers/acpi/tables.c | 10 +-
> drivers/firmware/efi/efi.c | 25 +-
> drivers/iommu/intel-iommu.c | 29 +-
> fs/dax.c | 3 +-
> fs/exec.c | 4 +-
> fs/userfaultfd.c | 7 +-
> include/asm-generic/pgtable.h | 8 +
> include/keys/mktme-type.h | 39 +
> include/linux/acpi.h | 9 +-
> include/linux/dma-direct.h | 4 +-
> include/linux/efi.h | 1 +
> include/linux/gfp.h | 51 +-
> include/linux/intel-iommu.h | 9 +-
> include/linux/mem_encrypt.h | 23 +-
> include/linux/migrate.h | 14 +-
> include/linux/mm.h | 27 +-
> include/linux/page_ext.h | 11 +-
> include/linux/syscalls.h | 2 +
> include/uapi/asm-generic/unistd.h | 4 +-
> kernel/fork.c | 2 +
> kernel/sys_ni.c | 2 +
> mm/compaction.c | 3 +
> mm/huge_memory.c | 6 +-
> mm/khugepaged.c | 10 +
> mm/ksm.c | 17 +
> mm/madvise.c | 2 +-
> mm/memory.c | 3 +-
> mm/mempolicy.c | 30 +-
> mm/migrate.c | 4 +-
> mm/mlock.c | 2 +-
> mm/mmap.c | 31 +-
> mm/mprotect.c | 98 ++-
> mm/page_alloc.c | 50 ++
> mm/page_ext.c | 5 +
> mm/rmap.c | 4 +-
> mm/userfaultfd.c | 3 +-
> security/keys/Makefile | 1 +
> security/keys/mktme_keys.c | 768 ++++++++++++++++++
> .../selftests/x86/mktme/encrypt_tests.c | 433 ++++++++++
> .../testing/selftests/x86/mktme/flow_tests.c | 266 ++++++
> tools/testing/selftests/x86/mktme/key_tests.c | 526 ++++++++++++
> .../testing/selftests/x86/mktme/mktme_test.c | 300 +++++++
> 76 files changed, 4301 insertions(+), 122 deletions(-)
> create mode 100644 Documentation/x86/mktme/index.rst
> create mode 100644 Documentation/x86/mktme/mktme_configuration.rst
> create mode 100644 Documentation/x86/mktme/mktme_demo.rst
> create mode 100644 Documentation/x86/mktme/mktme_encrypt.rst
> create mode 100644 Documentation/x86/mktme/mktme_keys.rst
> create mode 100644 Documentation/x86/mktme/mktme_mitigations.rst
> create mode 100644 Documentation/x86/mktme/mktme_overview.rst
> create mode 100644 arch/x86/include/asm/mktme.h
> create mode 100644 arch/x86/mm/mem_encrypt_common.c
> create mode 100644 arch/x86/mm/mktme.c
> create mode 100644 include/keys/mktme-type.h
> create mode 100644 security/keys/mktme_keys.c
> create mode 100644 tools/testing/selftests/x86/mktme/encrypt_tests.c
> create mode 100644 tools/testing/selftests/x86/mktme/flow_tests.c
> create mode 100644 tools/testing/selftests/x86/mktme/key_tests.c
> create mode 100644 tools/testing/selftests/x86/mktme/mktme_test.c
>
> --
> 2.20.1
>

--
Sincerely yours,
Mike.