Re: [PATCH v3 1/2] x86/boot/64: Make level2_kernel_pgt pages invalid outside kernel area.
From: Kirill A. Shutemov
Date: Thu Sep 26 2019 - 06:23:03 EST
On Tue, Sep 24, 2019 at 04:03:55PM -0500, Steve Wahl wrote:
> Our hardware (UV aka Superdome Flex) has address ranges marked
> reserved by the BIOS. Access to these ranges is caught as an error,
> causing the BIOS to halt the system.
>
> Initial page tables mapped a large range of physical addresses that
> were not checked against the list of BIOS reserved addresses, and
> sometimes included reserved addresses in part of the mapped range.
> Including the reserved range in the map allowed processor speculative
> accesses to the reserved range, triggering a BIOS halt.
>
> Used early in booting, the page table level2_kernel_pgt addresses 1
> GiB divided into 2 MiB pages, and it was set up to linearly map a full
> 1 GiB of physical addresses that included the physical address range
> of the kernel image, as chosen by KASLR. But this also included a
> large range of unused addresses on either side of the kernel image.
> And unlike the kernel image's physical address range, this extra
> mapped space was not checked against the BIOS tables of usable RAM
> addresses. So there were times when the addresses chosen by KASLR
> would result in processor accessible mappings of BIOS reserved
> physical addresses.
>
> The kernel code did not directly access any of this extra mapped
> space, but having it mapped allowed the processor to issue speculative
> accesses into reserved memory, causing system halts.
>
> This was encountered somewhat rarely on a normal system boot, and much
> more often when starting the crash kernel if "crashkernel=512M,high"
> was specified on the command line (this heavily restricts the physical
> address of the crash kernel, in our case usually within 1 GiB of
> reserved space).
>
> The solution is to invalidate the pages of this table outside the
> kernel image's space before the page table is activated. This patch
> has been validated to fix this problem on our hardware.
>
> Signed-off-by: Steve Wahl <steve.wahl@xxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
Acked-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
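
For readers following the description above, here is a minimal userspace
sketch of the idea, not the code from the patch. A table covering 1 GiB in
2 MiB pages has 1 GiB / 2 MiB = 512 entries; the sketch fills such a table
with a linear map and then clears the present bit on every entry that does
not overlap the kernel image. The names fill_linear(), invalidate_outside()
and count_present(), the flag macros, and the offset-based interface are all
assumptions made for illustration; the kernel works on level2_kernel_pgt
with its own helpers and constants.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative constants modelled on x86-64 2 MiB mappings (assumed names). */
#define PTRS_PER_PMD 512u               /* 1 GiB / 2 MiB entries per table */
#define PMD_SHIFT    21                 /* log2(2 MiB) */
#define PMD_SIZE     (1ULL << PMD_SHIFT)
#define PAGE_PRESENT 0x01ULL            /* bit 0: entry is valid */
#define PAGE_PSE     0x80ULL            /* bit 7: entry maps a large page */

/* Hypothetical helper: linearly map 1 GiB of physical space starting at base. */
static void fill_linear(uint64_t pmd[PTRS_PER_PMD], uint64_t base)
{
    for (size_t i = 0; i < PTRS_PER_PMD; i++)
        pmd[i] = (base + i * PMD_SIZE) | PAGE_PSE | PAGE_PRESENT;
}

/*
 * Hypothetical helper: clear the present bit on every entry that does not
 * overlap [start_off, end_off), where both are byte offsets into the 1 GiB
 * window covered by the table.
 */
static void invalidate_outside(uint64_t pmd[PTRS_PER_PMD],
                               uint64_t start_off, uint64_t end_off)
{
    assert(start_off < end_off && end_off <= PTRS_PER_PMD * PMD_SIZE);

    size_t first = (size_t)(start_off >> PMD_SHIFT);      /* first kept entry */
    size_t last  = (size_t)((end_off - 1) >> PMD_SHIFT);  /* last kept entry */

    for (size_t i = 0; i < PTRS_PER_PMD; i++)
        if (i < first || i > last)
            pmd[i] &= ~PAGE_PRESENT;
}

/* Hypothetical helper: count entries that are still marked present. */
static size_t count_present(const uint64_t pmd[PTRS_PER_PMD])
{
    size_t n = 0;

    for (size_t i = 0; i < PTRS_PER_PMD; i++)
        if (pmd[i] & PAGE_PRESENT)
            n++;
    return n;
}
```

With an image occupying 30 of the 2 MiB slots, only those 30 entries stay
present after invalidate_outside(); the rest of the 1 GiB window is no
longer reachable through this table, so the processor cannot speculate
into it.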
--
Kirill A. Shutemov