Re: [PATCH v2 1/3] efi/x86: skip efi_arch_mem_reserve() in case of kexec.

From: Dave Young
Date: Tue Mar 19 2024 - 00:00:44 EST


Hi,

Added Ard to Cc.

On 03/18/24 at 07:02am, Ashish Kalra wrote:
> From: Ashish Kalra <ashish.kalra@xxxxxxx>
>
> For kexec use case, need to use and stick to the EFI memmap passed
> from the first kernel via boot-params/setup data, hence,
> skip efi_arch_mem_reserve() during kexec.
>
> Additionally, during SNP guest kexec testing we discovered that the EFI
> memmap is corrupted during chained kexec. kexec_enter_virtual_mode() during
> late init will remap the efi_memmap physical pages allocated in
> efi_arch_mem_reserve() via memblock & then subsequently cause random
> EFI memmap corruption once memblock is freed/torn down.
>
> Signed-off-by: Ashish Kalra <ashish.kalra@xxxxxxx>
> ---
> arch/x86/platform/efi/quirks.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
> index f0cc00032751..d4562d074371 100644
> --- a/arch/x86/platform/efi/quirks.c
> +++ b/arch/x86/platform/efi/quirks.c
> @@ -258,6 +258,16 @@ void __init efi_arch_mem_reserve(phys_addr_t addr, u64 size)
> int num_entries;
> void *new;
>
> + /*
> + * For kexec use case, we need to use the EFI memmap passed from the first
> + * kernel via setup data, so we need to skip this.
> + * Additionally, kexec_enter_virtual_mode() during late init will remap
> + * the efi_memmap physical pages allocated here via memblock & then
> + * subsequently cause random EFI memmap corruption once memblock is freed.

Can you elaborate a bit on the corruption? Is it reproducible without
SNP?

> + */
> + if (efi_setup)
> + return;
> +

How about checking the md attribute instead of checking efi_setup?
Personally I think that is a bit better. Something like the below:

diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
index f0cc00032751..699332b075bb 100644
--- a/arch/x86/platform/efi/quirks.c
+++ b/arch/x86/platform/efi/quirks.c
@@ -255,15 +255,24 @@ void __init efi_arch_mem_reserve(phys_addr_t addr, u64 size)
 	struct efi_memory_map_data data = { 0 };
 	struct efi_mem_range mr;
 	efi_memory_desc_t md;
-	int num_entries;
+	int num_entries, ret;
 	void *new;

-	if (efi_mem_desc_lookup(addr, &md) ||
-	    md.type != EFI_BOOT_SERVICES_DATA) {
+	ret = efi_mem_desc_lookup(addr, &md);
+	if (ret) {
 		pr_err("Failed to lookup EFI memory descriptor for %pa\n", &addr);
 		return;
 	}

+	if (md.type != EFI_BOOT_SERVICES_DATA) {
+		pr_err("Skip reserving non EFI Boot Service Data memory for %pa\n", &addr);
+		return;
+	}
+
+	/* Kexec copied the efi memmap from the 1st kernel, thus skip the case. */
+	if (md.attribute & EFI_MEMORY_RUNTIME)
+		return;
+
 	if (addr + size > md.phys_addr + (md.num_pages << EFI_PAGE_SHIFT)) {
 		pr_err("Region spans EFI memory descriptors, %pa\n", &addr);
 		return;
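
To make the order of the checks easier to discuss, here is a rough
userspace sketch of the decision logic only. It is not kernel code:
struct mem_desc, enum reserve_decision and classify_reserve() are
made-up names for illustration, and the sketch assumes the memmap
inherited from the first kernel already carries EFI_MEMORY_RUNTIME on
the ranges that kernel reserved. The constants use the values from the
UEFI spec.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as defined by the UEFI specification. */
#define EFI_BOOT_SERVICES_DATA	4
#define EFI_MEMORY_RUNTIME	0x8000000000000000ULL
#define EFI_PAGE_SHIFT		12

/* Hypothetical, simplified stand-in for the kernel's efi_memory_desc_t. */
struct mem_desc {
	uint32_t type;
	uint64_t phys_addr;
	uint64_t num_pages;
	uint64_t attribute;
};

/* Hypothetical result codes, one per early-return path in the diff. */
enum reserve_decision {
	RESERVE_OK,
	SKIP_WRONG_TYPE,
	SKIP_ALREADY_RUNTIME,
	SKIP_SPANS_DESCRIPTORS,
};

/*
 * Mirror the order of the checks in the proposed diff: descriptor type
 * first, then the runtime attribute (assumed to be set already when the
 * memmap was copied from the first kernel), then the bounds check.
 */
enum reserve_decision classify_reserve(const struct mem_desc *md,
				       uint64_t addr, uint64_t size)
{
	assert(md != NULL);

	if (md->type != EFI_BOOT_SERVICES_DATA)
		return SKIP_WRONG_TYPE;

	if (md->attribute & EFI_MEMORY_RUNTIME)
		return SKIP_ALREADY_RUNTIME;

	if (addr + size > md->phys_addr + (md->num_pages << EFI_PAGE_SHIFT))
		return SKIP_SPANS_DESCRIPTORS;

	return RESERVE_OK;
}
```

With this ordering, a kexec'ed kernel that sees a range the first
kernel already marked as runtime returns before any new memmap is
allocated, without having to look at efi_setup at all.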