Re: [PATCH] x86/boot/64: Make level2_kernel_pgt pages invalid outside kernel area.
From: Kirill A. Shutemov
Date: Mon Sep 09 2019 - 04:14:20 EST
On Fri, Sep 06, 2019 at 04:29:50PM -0500, Steve Wahl wrote:
> Our hardware (UV aka Superdome Flex) has address ranges marked
> reserved by the BIOS. These ranges can cause the system to halt if
> accessed.
>
> During kernel initialization, the processor was speculating into
> reserved memory causing system halts. The processor speculation is
> enabled because the reserved memory is being mapped by the kernel.
>
> The page table level2_kernel_pgt is 1 GiB in size, and had all pages
> initially marked as valid, and the kernel is placed anywhere in this
> range depending on the virtual address selected by KASLR. Later on in
> the boot process, the valid area gets trimmed back to the space
> occupied by the kernel.
>
> But during the interval of time when the full 1 GiB space was marked
> as valid, if the kernel physical address chosen by KASLR was close
> enough to our reserved memory regions, the valid pages outside the
> actual kernel space were allowing the processor to issue speculative
> accesses to the reserved space, causing the system to halt.
>
> This was encountered somewhat rarely on a normal system boot, and
> somewhat more often when starting the crash kernel if
> "crashkernel=512M,high" was specified on the command line (because
> this heavily restricts the physical address of the crash kernel,
> usually to within 1 GiB of our reserved space).
>
> The answer is to invalidate the pages of this table outside the
> address range occupied by the kernel before the page table is
> activated. This patch has been validated to fix this problem on our
> hardware.
If the goal is to avoid *any* mapping of the reserved region to stop
speculation, I don't think this patch will do the job. We still (likely)
have the same memory mapped as part of the identity mapping. And it
happens in at least two places: here and earlier, in the decompression stage.
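
To make the concern concrete, here is a rough user-space sketch. It is not
the kernel code and not the patch itself. It assumes a level-2 table of 512
entries of 2 MiB each (1 GiB total) and a "present" flag in bit 0. The names
fill_present(), trim_pmd() and count_present() are made up for illustration.
Trimming one table to the kernel image does nothing to a second table that
still maps the same physical range:

```c
#include <assert.h>
#include <stdint.h>

#define PMD_SHIFT     21                    /* each entry maps 2 MiB */
#define PMD_SIZE      (1ULL << PMD_SHIFT)
#define PTRS_PER_PMD  512                   /* 512 * 2 MiB = 1 GiB */
#define PAGE_PRESENT  0x1ULL                /* assumed "valid" bit */

/* Hypothetical helper: mark every 2 MiB entry of a table as present. */
static void fill_present(uint64_t *pmd)
{
    for (int i = 0; i < PTRS_PER_PMD; i++)
        pmd[i] = ((uint64_t)i << PMD_SHIFT) | PAGE_PRESENT;
}

/*
 * Hypothetical helper: clear the present bit on every entry that does
 * not overlap [start, end). Both are byte offsets into the 1 GiB window.
 */
static void trim_pmd(uint64_t *pmd, uint64_t start, uint64_t end)
{
    assert(start < end);
    for (int i = 0; i < PTRS_PER_PMD; i++) {
        uint64_t lo = (uint64_t)i << PMD_SHIFT;
        uint64_t hi = lo + PMD_SIZE;

        if (hi <= start || lo >= end)
            pmd[i] &= ~PAGE_PRESENT;
    }
}

/* Hypothetical helper: count the entries that are still present. */
static int count_present(const uint64_t *pmd)
{
    int n = 0;

    for (int i = 0; i < PTRS_PER_PMD; i++)
        if (pmd[i] & PAGE_PRESENT)
            n++;
    return n;
}
```

If one table stands in for level2_kernel_pgt and a second one for the
identity mapping, calling trim_pmd() on the first leaves the second fully
present, so the reserved range would still be reachable through it.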
--
Kirill A. Shutemov