[PATCH] x86/boot/64: Make level2_kernel_pgt pages invalid outside kernel area

From: Steve Wahl
Date: Fri Sep 06 2019 - 17:30:23 EST


Our hardware (UV aka Superdome Flex) has address ranges marked
reserved by the BIOS. These ranges can cause the system to halt if
accessed.

During kernel initialization, the processor was speculating into
reserved memory, causing system halts. The speculation is possible
because the kernel maps the reserved memory.

The page table level2_kernel_pgt maps 1 GiB of address space, and all
of its pages were initially marked valid. The kernel is placed
somewhere within this range, depending on the virtual address selected
by KASLR. Later in the boot process, the valid area is trimmed back to
the space occupied by the kernel.
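
For reference, the 1 GiB figure follows from the table geometry: 512
entries, each mapping one 2 MiB page. The sketch below is a user-space
illustration, not kernel code. The SKETCH_ names are hypothetical
stand-ins that mirror the x86-64 values of PMD_SHIFT and PTRS_PER_PMD,
and sketch_pmd_index() uses the same arithmetic as pmd_index().

```c
#include <assert.h>
#include <stdint.h>

/* Values mirror x86-64 paging; the SKETCH_ names are illustrative only. */
#define SKETCH_PMD_SHIFT     21                      /* one entry maps 2 MiB */
#define SKETCH_PTRS_PER_PMD  512                     /* entries per level-2 table */
#define SKETCH_PMD_SIZE      (1ULL << SKETCH_PMD_SHIFT)

/* 512 entries x 2 MiB each = 1 GiB mapped by a single level-2 table. */
static_assert(SKETCH_PTRS_PER_PMD * SKETCH_PMD_SIZE == (1ULL << 30),
              "one level-2 table maps 1 GiB");

/* Index of the level-2 entry that maps addr. */
static inline unsigned int sketch_pmd_index(uint64_t addr)
{
    return (unsigned int)((addr >> SKETCH_PMD_SHIFT) &
                          (SKETCH_PTRS_PER_PMD - 1));
}
```

An address 16 MiB above a 1 GiB boundary therefore lands in entry 8,
and the last byte of the same 1 GiB window lands in entry 511.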

But while the full 1 GiB was still marked valid, if the kernel
physical address chosen by KASLR was close enough to our reserved
memory regions, the valid pages outside the actual kernel space
allowed the processor to issue speculative accesses to the reserved
space, causing the system to halt.

This was encountered somewhat rarely on a normal system boot, and
somewhat more often when starting the crash kernel if
"crashkernel=512M,high" was specified on the command line (because
this heavily restricts the physical address of the crash kernel,
usually to within 1 GiB of our reserved space).

The answer is to invalidate the pages of this table outside the
address range occupied by the kernel before the page table is
activated. This patch has been validated to fix this problem on our
hardware.
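
The effect of the trimming can be modelled in user space on a plain
array. This is only a sketch under stated assumptions: the SKETCH_
names and both helper functions are hypothetical, the entries are
ordinary 64-bit integers rather than a live page table, and the
load_delta adjustment that the patch applies to the entries inside the
kernel range is left out.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PTRS_PER_PMD  512
#define SKETCH_PAGE_PRESENT  0x1ULL   /* bit 0 marks an x86 entry valid */

/*
 * Clear the present bit of every entry outside [first, last], where
 * first and last are the indices of the entries that hold the start
 * and the end of the kernel image. Entries inside the range are left
 * untouched.
 */
static inline void sketch_trim_outside(uint64_t *pmd, size_t first,
                                       size_t last)
{
    size_t i;

    /* Entries below the kernel image. */
    for (i = 0; i < first && i < SKETCH_PTRS_PER_PMD; i++)
        pmd[i] &= ~SKETCH_PAGE_PRESENT;

    /* Entries above the kernel image. */
    for (i = last + 1; i < SKETCH_PTRS_PER_PMD; i++)
        pmd[i] &= ~SKETCH_PAGE_PRESENT;
}

/* Count the entries that are still marked present. */
static inline size_t sketch_count_present(const uint64_t *pmd)
{
    size_t i, n = 0;

    for (i = 0; i < SKETCH_PTRS_PER_PMD; i++)
        if (pmd[i] & SKETCH_PAGE_PRESENT)
            n++;
    return n;
}
```

With all 512 entries initially present and a kernel image spanning
entries 8 through 20, only those 13 entries remain valid after the
call, so nothing outside the image is mapped for the processor to
speculate into.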

Signed-off-by: Steve Wahl <steve.wahl@xxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx
---
arch/x86/kernel/head64.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 29ffa495bd1c..31f89a5defa3 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -225,10 +225,15 @@ unsigned long __head __startup_64(unsigned long physaddr,
 	 */
 
 	pmd = fixup_pointer(level2_kernel_pgt, physaddr);
-	for (i = 0; i < PTRS_PER_PMD; i++) {
+	for (i = 0; i < pmd_index((unsigned long)_text); i++)
+		pmd[i] &= ~_PAGE_PRESENT;
+
+	for (; i <= pmd_index((unsigned long)_end); i++)
 		if (pmd[i] & _PAGE_PRESENT)
 			pmd[i] += load_delta;
-	}
+
+	for (; i < PTRS_PER_PMD; i++)
+		pmd[i] &= ~_PAGE_PRESENT;
 
 	/*
 	 * Fixup phys_base - remove the memory encryption mask to obtain
--
2.21.0


--
Steve Wahl, Hewlett Packard Enterprise