Re: [RFC 1/2] efi/memattr: Use desc_size instead of total size to check for corruption
From: Ard Biesheuvel
Date: Fri Jan 10 2025 - 12:26:04 EST
On Fri, 10 Jan 2025 at 11:53, Usama Arif <usamaarif642@xxxxxxxxx> wrote:
>
>
>
> On 10/01/2025 07:21, Ard Biesheuvel wrote:
> > On Thu, 9 Jan 2025 at 17:36, Usama Arif <usamaarif642@xxxxxxxxx> wrote:
> >>
> >>
> >>
> >> On 09/01/2025 15:45, Ard Biesheuvel wrote:
> >>> On Wed, 8 Jan 2025 at 23:00, Usama Arif <usamaarif642@xxxxxxxxx> wrote:
> >>>>
> >>>> The commit in [1] introduced a check to see if the EFI memory attributes
> >>>> table was corrupted. It assumed that efi.memmap.nr_map remains
> >>>> constant, but it changes during late boot.
> >>>> Hence, the check is valid during cold boot, but not in the subsequent
> >>>> kexec boot.
> >>>>
> >>>> This is best explained with an example. At cold boot, for a test
> >>>> machine:
> >>>> efi.memmap.nr_map=91,
> >>>> memory_attributes_table->num_entries=48,
> >>>> desc_size = 48
> >>>> Hence, the check introduced in [1], which treats 3x the size of the
> >>>> entire EFI memory map as a reasonable upper bound for the size of this
> >>>> table, is valid.
> >>>>
> >>>> In late boot, __efi_enter_virtual_mode calls 2 functions that update
> >>>> efi.memmap.nr_map:
> >>>> - efi_map_regions which reduces the `count` of map entries
> >>>> (e.g. if should_map_region returns false) and which is reflected
> >>>> in efi.memmap by __efi_memmap_init.
> >>>> At this point efi.memmap.nr_map becomes 46 in the test machine.
> >>>> - efi_free_boot_services which also reduces the number of memory regions
> >>>> available (e.g. if md->type or md->attribute is not the right value).
> >>>> At this point efi.memmap.nr_map becomes 9 in the test machine.
> >>>> Hence when you kexec into a new kernel and pass efi.memmap, the
> >>>> parameters that are compared are:
> >>>> efi.memmap.nr_map=9,
> >>>> memory_attributes_table->num_entries=48,
> >>>> desc_size = 48
> >>>> where the check in [1] is no longer valid with such a low efi.memmap.nr_map
> >>>> as it was reduced due to efi_map_regions and efi_free_boot_services.
> >>>>
> >>>> A more appropriate check is to see if the descriptor size (desc_size)
> >>>> reported by efi.memmap and by the memory attributes table is the same.
> >>>>
> >>>> [1] https://lore.kernel.org/all/20241031175822.2952471-2-ardb+git@xxxxxxxxxx/
> >>>>
> >>>> Fixes: 8fbe4c49c0cc ("efi/memattr: Ignore table if the size is clearly bogus")
> >>>> Reported-by: Breno Leitao <leitao@xxxxxxxxxx>
> >>>> Signed-off-by: Usama Arif <usamaarif642@xxxxxxxxx>
> >>>> ---
> >>>> drivers/firmware/efi/memattr.c | 16 ++++++----------
> >>>> 1 file changed, 6 insertions(+), 10 deletions(-)
> >>>>
> >>>
> >>> The more I think about this, the more I feel that kexec on x86 should
> >>> simply discard this table, and run with the firmware code RWX (which
> >>> is not the end of the world).
> >>
> >>
> >> By discard this table, do you mean kexec not use e820_table_firmware?
> >
> > No, I mean kexec ignores the memory attributes table.
> >
> >> Also a very basic question, what do you mean by run with the firmware RWX?
> >>
> >
> > The memory attributes table is an overlay for the EFI memory map that
> > describes which runtime code regions may be mapped with restricted
> > permissions. Without this table, everything will be mapped writable as
> > well as executable, but only in the EFI page tables, which are only
> > active when an EFI call is in progress.
> >
>
> Thanks for explaining!
>
> So basically get rid of memattr.c :)
>
> Do you mean get rid of it only for kexec, or not do it for any
> boot (including cold boot)?

Only for kexec, and only on x86.

> I do like this idea! I couldn't find this in the git history,
> but do you know if this was added in the linux kernel just
> because EFI spec added support for it, or if there was a
> specific security problem?
>
Mapping memory RWX is generally a bad idea, so we should avoid it if
possible. But EFI runtime memory regions are only mapped during an EFI
runtime call, and these don't happen often at all, so the benefit is
only marginal. (In the early days of EFI, it was more common for the
OS to map these regions permanently, but we stopped doing that a long
time ago.)
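
For reference, the arithmetic behind the example in the patch description
can be sketched as below. This is a minimal standalone sketch, not the code
in drivers/firmware/efi/memattr.c: the helper names are made up for
illustration, and it assumes that the check from [1] compares
num_entries * desc_size against 3 * efi.memmap.nr_map * efi.memmap.desc_size,
and that both descriptor sizes are 48 on the test machine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a kernel function): returns true if a memory
 * attributes table of num_entries descriptors would be rejected by a
 * "larger than 3x the EFI memory map" size check.
 */
bool memattr_size_looks_bogus(size_t num_entries, size_t tbl_desc_size,
			      size_t nr_map, size_t map_desc_size)
{
	size_t tbl_size = num_entries * tbl_desc_size;	/* 48 * 48 = 2304 */
	size_t bound = 3 * nr_map * map_desc_size;	/* depends on nr_map */

	return tbl_size > bound;
}

/* Replays the three nr_map values quoted in the patch description. */
void memattr_examples(void)
{
	/* Cold boot: nr_map = 91, bound = 13104, table accepted. */
	assert(!memattr_size_looks_bogus(48, 48, 91, 48));

	/* After efi_map_regions: nr_map = 46, bound = 6624, still accepted. */
	assert(!memattr_size_looks_bogus(48, 48, 46, 48));

	/* After efi_free_boot_services (kexec case): nr_map = 9, bound = 1296. */
	assert(memattr_size_looks_bogus(48, 48, 9, 48));
}
```

With these numbers the table is 2304 bytes in all three cases. The bound is
13104 bytes at cold boot and 6624 bytes after efi_map_regions, so the table
is accepted, but it drops to 1296 bytes once nr_map is 9, so the kexec'ed
kernel would reject a table that has not changed.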