Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec

From: Kalra, Ashish
Date: Mon Jun 03 2024 - 10:02:09 EST


On 6/3/2024 8:39 AM, Mike Rapoport wrote:

On Mon, Jun 03, 2024 at 08:06:56AM -0500, Kalra, Ashish wrote:
On 6/3/2024 3:56 AM, Borislav Petkov wrote:

efi_arch_mem_reserve() calls efi_memmap_alloc() to allocate memory for
EFI memory map and due to early allocation it uses memblock allocation.

Later during boot, efi_enter_virtual_mode() calls kexec_enter_virtual_mode()
in case of a kexec-ed kernel boot.

This function kexec_enter_virtual_mode() installs the new EFI memory map by
calling efi_memmap_init_late() which remaps the efi_memmap physically allocated
in efi_arch_mem_reserve(), but this remapping is still using memblock allocation.

Subsequently, when memblock is freed later in boot flow, this remapped
efi_memmap will have random corruption (similar to a use-after-free scenario).

The corrupted EFI memory map is then passed to the next kexec-ed kernel
which causes a panic when trying to use the corrupted EFI memory map.
This sounds fishy: memblock allocated memory is not freed later in the
boot - it remains reserved. Only free memory is freed from memblock to
the buddy allocator.

Or is the problem that memblock-allocated memory cannot be memremapped
because *raisins*?
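
The objection above (memblock-allocated memory stays reserved, and only free memory is handed to the buddy allocator) can be pictured with a toy model. This is a minimal sketch, not kernel code: every `toy_*` name and the 8-page layout are assumptions made up for illustration, and the real memblock works on physical address ranges rather than per-page flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES 8 /* hypothetical, tiny "physical memory" */

/* Toy stand-in for memblock: one "reserved" flag per page. */
struct toy_memblock {
    bool reserved[TOY_NPAGES];
};

/* Toy stand-in for the buddy allocator: which pages it now owns. */
struct toy_buddy {
    bool owned[TOY_NPAGES];
};

/*
 * Early allocation: reserve the first run of `count` unreserved pages.
 * Returns the index of the first page, or -1 if no run is large enough.
 */
int toy_memblock_alloc(struct toy_memblock *mb, size_t count)
{
    assert(mb != NULL);
    if (count == 0 || count > TOY_NPAGES)
        return -1;
    for (size_t start = 0; start + count <= TOY_NPAGES; start++) {
        size_t run = 0;
        while (run < count && !mb->reserved[start + run])
            run++;
        if (run == count) {
            for (size_t i = 0; i < count; i++)
                mb->reserved[start + i] = true;
            return (int)start;
        }
    }
    return -1;
}

/*
 * Hand-off at the end of early boot: only pages that were never reserved
 * are released to the buddy allocator. Returns the number released.
 */
size_t toy_free_all(const struct toy_memblock *mb, struct toy_buddy *buddy)
{
    size_t released = 0;

    assert(mb != NULL && buddy != NULL);
    for (size_t i = 0; i < TOY_NPAGES; i++) {
        buddy->owned[i] = !mb->reserved[i];
        if (buddy->owned[i])
            released++;
    }
    return released;
}
```

In this model, pages returned by toy_memblock_alloc() never reach the buddy allocator, so nothing else can later be handed the same pages. That is why a plain use-after-free explanation looks doubtful here.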
This is what seems to be happening:

efi_arch_mem_reserve() calls efi_memmap_alloc() to allocate memory for
EFI memory map and due to early allocation it uses memblock allocation.

And later efi_enter_virtual_mode() calls kexec_enter_virtual_mode()
in case of a kexec-ed kernel boot.

This function kexec_enter_virtual_mode() installs the new EFI memory map by
calling efi_memmap_init_late() which does memremap() on memblock-allocated memory.
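
The call order described above can be written down as a small trace. This is only a sketch of the sequence as stated in this mail: the `sketch_*` functions are hypothetical stubs that record names, they do not call or reproduce the kernel functions, and the non-kexec path is deliberately left unmodeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_STEPS 8

/* Records the order in which the stub functions run. */
struct sketch_trace {
    const char *steps[SKETCH_MAX_STEPS];
    size_t count;
};

void sketch_record(struct sketch_trace *t, const char *step)
{
    assert(t->count < SKETCH_MAX_STEPS);
    t->steps[t->count++] = step;
}

/* Early boot: the new EFI memory map is allocated from memblock. */
void sketch_efi_arch_mem_reserve(struct sketch_trace *t)
{
    sketch_record(t, "efi_arch_mem_reserve");
    sketch_record(t, "efi_memmap_alloc (memblock)");
}

/* kexec path: install the new map, which memremap()s that allocation. */
void sketch_kexec_enter_virtual_mode(struct sketch_trace *t)
{
    sketch_record(t, "kexec_enter_virtual_mode");
    sketch_record(t, "efi_memmap_init_late (memremap)");
}

/* Later in boot: only the kexec branch is modeled here. */
void sketch_efi_enter_virtual_mode(struct sketch_trace *t, bool kexec_boot)
{
    sketch_record(t, "efi_enter_virtual_mode");
    if (kexec_boot)
        sketch_kexec_enter_virtual_mode(t);
}

/* Whole sequence, in the order given in the description. */
void sketch_boot(struct sketch_trace *t, bool kexec_boot)
{
    sketch_efi_arch_mem_reserve(t);
    sketch_efi_enter_virtual_mode(t, kexec_boot);
}
```

Calling sketch_boot() with kexec_boot set to true records five steps, ending with the memremap() of the memblock-allocated map.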
Does the issue happen only with SNP?

This is observed under SNP because efi_arch_mem_reserve() is only called with SNP enabled, and it then allocates the EFI memory map using memblock.

If we skip efi_arch_mem_reserve() (which should probably be skipped anyway in the kexec case), then for a kexec boot the EFI memmap is memremapped at the same virtual address as in the first kernel, and not at the memblock-allocated address.

Thanks, Ashish


I didn't really dig, but my theory would be that it has something to do
with arch_memremap_can_ram_remap() in arch/x86/mm/ioremap.c
Thanks, Ashish