Re: [RFC 2/2] efi/memattr: add efi_mem_attr_table as a reserved region in 820_table_firmware
From: Gregory Price
Date: Thu Jan 09 2025 - 11:47:49 EST
On Thu, Jan 09, 2025 at 04:32:10PM +0000, Usama Arif wrote:
>
>
> On 09/01/2025 16:15, Ard Biesheuvel wrote:
> > On Wed, 8 Jan 2025 at 23:00, Usama Arif <usamaarif642@xxxxxxxxx> wrote:
> >>
> >> When this area is not reserved, it comes up as usable in
> >> /sys/firmware/memmap. This means that kexec, which uses that memmap
> >> to find usable memory regions, can select the region where
> >> efi_mem_attr_table is and overwrite it and relocate_kernel.
> >>
> >> Since the patch in [1] was merged, all boots after kexec
> >> are producing the warning that it introduced.
> >>
> >> Having a fix in firmware can be difficult to get through.
> >
> > I don't follow. I don't think there is anything wrong with the
> > firmware here. Could you elaborate?
> >
>
> So the problem is, kexec sees this memory as System RAM, and decides
> it can be used to place an image here.
>
> I guess the question is (and I actually don't know the answer here),
> whose responsibility is it to mark this region as reserved so that
> its not touched by anyone else. I would have thought it should be
> firmware?
>
> Maybe its not the firmwares' job to mark it as reserved, but just pass
> it to kernel and the kernel is supposed to make sure it gets reserved
> in a proper way, even across kexecs.
Reservation is a kernel concept, the firmware/EFI etc give the kernel
guidance - but the actual maps here are largely informational and the
kernel can do whatever it wants (within reason).
So what you're really saying is either
a) the hardware is mis-reporting the memory map configurations
(e.g. some EFI_* bit is missing), or
b) the kernel isn't doing the right thing.
I believe Ard is pointing out that the additional map abstraction is just
trying to retain a "current" versus "original" configuration - and this
is more of a headache than it is worth as it appears to be causing
confusion as to which one should be respected across kexec.
So the separate maps should just be one, and we should just smash the
"original" configuration rather than keeping a copy.
Please correct me if i'm misunderstanding anything.
>
> I think in the end whoevers' responsibility it is, the easiest path forward
> seems to be in kernel? (and not firmware or libstub)
>
> >
> >> The next ideal place would be in libstub. However, it looks like
> >> InstallMemoryAttributesTable [2] is not available as a boot service
> >> call option [3], [4], and install_configuration_table does not
> >> seem to work as a valid substitute.
> >>
> >
> > To do what, exactly?
> >
>
> To change the memory type from System RAM to either reserved or
> something more appropriate, i.e. any type that is not touched by
> kexec or any other userspace.
>
> Basically the example code I attached at the end of the cover letter in
> https://lore.kernel.org/all/20250108215957.3437660-1-usamaarif642@xxxxxxxxx/
> It could be EFI_ACPI_RECLAIM_MEMORY or EFI_RESERVED_TYPE, both of which aren't
> touched by kexec.
>
> >> As a last option for a fix, this patch marks that region as reserved in
> >> e820_table_firmware if it is currently E820_TYPE_RAM so that kexec doesn't
> >> use it for kernel segments.
> >>
> >> [1] https://lore.kernel.org/all/20241031175822.2952471-2-ardb+git@xxxxxxxxxx/
> >> [2] https://github.com/tianocore/edk2/blob/master/MdeModulePkg/Core/Dxe/Misc/MemoryAttributesTable.c#L100
> >> [3] https://github.com/tianocore/edk2/blob/42a141800c0c26a09d2344e84a89ce4097a263ae/MdeModulePkg/Core/Dxe/DxeMain/DxeMain.c#L41
> >> [4] https://elixir.bootlin.com/linux/v6.12.6/source/drivers/firmware/efi/libstub/efistub.h#L327
> >>
> >> Reported-by: Breno Leitao <leitao@xxxxxxxxxx>
> >> Signed-off-by: Usama Arif <usamaarif642@xxxxxxxxx>
> >> ---
> >> arch/x86/include/asm/e820/api.h | 2 ++
> >> arch/x86/kernel/e820.c | 6 ++++++
> >> arch/x86/platform/efi/efi.c | 9 +++++++++
> >> drivers/firmware/efi/memattr.c | 1 +
> >> include/linux/efi.h | 7 +++++++
> >> 5 files changed, 25 insertions(+)
> >>
> >> diff --git a/arch/x86/include/asm/e820/api.h b/arch/x86/include/asm/e820/api.h
> >> index 2e74a7f0e935..4e9aa24f03bd 100644
> >> --- a/arch/x86/include/asm/e820/api.h
> >> +++ b/arch/x86/include/asm/e820/api.h
> >> @@ -16,6 +16,8 @@ extern bool e820__mapped_all(u64 start, u64 end, enum e820_type type);
> >>
> >> extern void e820__range_add (u64 start, u64 size, enum e820_type type);
> >> extern u64 e820__range_update(u64 start, u64 size, enum e820_type old_type, enum e820_type new_type);
> >> +extern u64 e820__range_update_firmware(u64 start, u64 size, enum e820_type old_type,
> >> + enum e820_type new_type);
> >> extern u64 e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool check_type);
> >> extern u64 e820__range_update_table(struct e820_table *t, u64 start, u64 size, enum e820_type old_type, enum e820_type new_type);
> >>
> >> diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
> >> index 82b96ed9890a..01d7d3c0d299 100644
> >> --- a/arch/x86/kernel/e820.c
> >> +++ b/arch/x86/kernel/e820.c
> >> @@ -538,6 +538,12 @@ u64 __init e820__range_update_table(struct e820_table *t, u64 start, u64 size,
> >> return __e820__range_update(t, start, size, old_type, new_type);
> >> }
> >>
> >> +u64 __init e820__range_update_firmware(u64 start, u64 size, enum e820_type old_type,
> >> + enum e820_type new_type)
> >> +{
> >> + return __e820__range_update(e820_table_firmware, start, size, old_type, new_type);
> >> +}
> >> +
> >> /* Remove a range of memory from the E820 table: */
> >> u64 __init e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool check_type)
> >> {
> >> diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
> >> index a7ff189421c3..13684c5d7c05 100644
> >> --- a/arch/x86/platform/efi/efi.c
> >> +++ b/arch/x86/platform/efi/efi.c
> >> @@ -168,6 +168,15 @@ static void __init do_add_efi_memmap(void)
> >> e820__update_table(e820_table);
> >> }
> >>
> >> +/* Reserve firmware area if it was marked as RAM */
> >> +void arch_update_firmware_area(u64 addr, u64 size)
> >> +{
> >> + if (e820__get_entry_type(addr, addr + size) == E820_TYPE_RAM) {
> >> + e820__range_update_firmware(addr, size, E820_TYPE_RAM, E820_TYPE_RESERVED);
> >> + e820__update_table(e820_table_firmware);
> >> + }
> >> +}
> >> +
> >> /*
> >> * Given add_efi_memmap defaults to 0 and there is no alternative
> >> * e820 mechanism for soft-reserved memory, import the full EFI memory
> >> diff --git a/drivers/firmware/efi/memattr.c b/drivers/firmware/efi/memattr.c
> >> index d3bc161361fb..d131781e2d7b 100644
> >> --- a/drivers/firmware/efi/memattr.c
> >> +++ b/drivers/firmware/efi/memattr.c
> >> @@ -53,6 +53,7 @@ int __init efi_memattr_init(void)
> >> size = tbl->num_entries * tbl->desc_size;
> >> tbl_size = sizeof(*tbl) + size;
> >> memblock_reserve(efi_mem_attr_table, tbl_size);
> >> + arch_update_firmware_area(efi_mem_attr_table, tbl_size);
> >> set_bit(EFI_MEM_ATTR, &efi.flags);
> >>
> >> unmap:
> >> diff --git a/include/linux/efi.h b/include/linux/efi.h
> >> index e5815867aba9..8eb9698bd6a4 100644
> >> --- a/include/linux/efi.h
> >> +++ b/include/linux/efi.h
> >> @@ -1358,4 +1358,11 @@ extern struct blocking_notifier_head efivar_ops_nh;
> >> void efivars_generic_ops_register(void);
> >> void efivars_generic_ops_unregister(void);
> >>
> >> +#ifdef CONFIG_X86_64
> >> +void __init arch_update_firmware_area(u64 addr, u64 size);
> >> +#else
> >> +static inline void __init arch_update_firmware_area(u64 addr, u64 size)
> >> +{
> >> +}
> >> +#endif
> >> #endif /* _LINUX_EFI_H */
> >> --
> >> 2.43.5
> >>
>