Re: [PATCH v7 1/3] efi/x86: Fix EFI memory map corruption with kexec

From: Kalra, Ashish
Date: Mon Jun 03 2024 - 13:06:08 EST


Re-sending this, last response got garbled.

On 6/3/2024 11:48 AM, Kalra, Ashish wrote:
On 6/3/2024 10:31 AM, Mike Rapoport wrote:

On Mon, Jun 03, 2024 at 04:46:39PM +0200, Borislav Petkov wrote:
On Mon, Jun 03, 2024 at 09:01:49AM -0500, Kalra, Ashish wrote:
If we skip efi_arch_mem_reserve() (which should probably be anyway skipped
for kexec case), then for kexec boot, EFI memmap is memremapped in the same
virtual address as the first kernel and not the allocated memblock address.
Are you saying that we should simply do

diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index fdf07dd6f459..410cb0743289 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -577,6 +577,9 @@ void __init efi_mem_reserve(phys_addr_t addr, u64 size)
      if (WARN_ON_ONCE(efi_enabled(EFI_PARAVIRT)))
          return;
  +    if (kexec_in_progress)
+        return;
+
      if (!memblock_is_region_reserved(addr, size))
          memblock_reserve(addr, size);
  and skip that whole call?
I think Ashish suggested rather

diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index fdf07dd6f459..eccc10ab15a4 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -580,6 +580,9 @@ void __init efi_mem_reserve(phys_addr_t addr, u64 size)
      if (!memblock_is_region_reserved(addr, size))
          memblock_reserve(addr, size);
  +    if (kexec_in_progress)
+        return;
+
      /*
       * Some architectures (x86) reserve all boot services ranges
       * until efi_free_boot_services() because of buggy firmware
Yes, something similar as above, as efi_mem_reserve() is used to reserve boot service memory and is not necessary for kexec boot.

So, Dave Young (dyoung@xxxxxxxxxx) had suggested that we skip efi_arch_mem_reserve() for kexec by checking the set EFI_MEMORY_RUNTIME attribute as below:

diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
index f0cc00032751..6f398c59278a 100644
--- a/arch/x86/platform/efi/quirks.c
+++ b/arch/x86/platform/efi/quirks.c
@@ -255,15 +255,39 @@ void __init efi_arch_mem_reserve(phys_addr_t addr, u64 size)
        struct efi_memory_map_data data = { 0 };
        struct efi_mem_range mr;
        efi_memory_desc_t md;
-       int num_entries;
+       int num_entries, ret;
        void *new;

-       if (efi_mem_desc_lookup(addr, &md) ||
-           md.type != EFI_BOOT_SERVICES_DATA) {
+       /*
+        * efi_mem_reserve() is used to reserve boot service memory, eg. bgrt,
+        * but it is not neccasery for kexec, as there are no boot services in
+        * kexec reboot at all after the first kernel's ExitBootServices().
+        *
+        * Therefore, skip efi_mem_reserve for kexec booting by checking the
+        * EFI_MEMORY_RUNTIME attribute which indicates boot service memory
+        * ranges reserved by the first kernel using efi_mem_reserve and marked
+        * with EFI_MEMORY_RUNTIME attribute.
+        */
+
+       ret = efi_mem_desc_lookup(addr, &md);

+       if (ret) {

                pr_err("Failed to lookup EFI memory descriptor for %pa\n", &addr);
                return;
        }

+       if (md.type != EFI_BOOT_SERVICES_DATA) {
+               pr_err("Skip reserving non EFI Boot Service Data memory for %pa\n", &addr);
+               return;
+       }
+
+       /* Kexec copied the efi memmap from the first kernel, thus skip the case */
+       if (md.attribute & EFI_MEMORY_RUNTIME)
+               return;
+
        if (addr + size > md.phys_addr + (md.num_pages << EFI_PAGE_SHIFT)) {
                pr_err("Region spans EFI memory descriptors, %pa\n", &addr);
                return;

Thanks, Ashish