Re: [PATCH v4 4/4] efi: Fix handling of multiple efi_fake_mem= entries

From: Ard Biesheuvel
Date: Thu Jan 09 2020 - 04:36:08 EST


On Wed, 8 Jan 2020 at 22:53, Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
>
> On Tue, Jan 7, 2020 at 9:52 AM Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx> wrote:
> >
> > On Tue, 7 Jan 2020 at 06:19, Dave Young <dyoung@xxxxxxxxxx> wrote:
> > >
> > > On 01/06/20 at 08:16pm, Dan Williams wrote:
> > > > On Mon, Jan 6, 2020 at 8:04 PM Dave Young <dyoung@xxxxxxxxxx> wrote:
> > > > >
> > > > > On 01/06/20 at 04:40pm, Dan Williams wrote:
> > > > > > Dave noticed that when specifying multiple efi_fake_mem= entries only
> > > > > > the last entry was successfully being reflected in the efi memory map.
> > > > > > > This is because efi_memmap_insert() is being called
> > > > > > multiple times, but on successive invocations the insertion should be
> > > > > > applied to the last new memmap rather than the original map at
> > > > > > efi_fake_memmap() entry.
> > > > > >
> > > > > > Rework efi_fake_memmap() to install the new memory map after each
> > > > > > efi_fake_mem= entry is parsed.
> > > > > >
> > > > > > This also fixes an issue in efi_fake_memmap() that caused it to litter
> > > > > > > empty entries into the end of the efi memory map. An empty entry causes
> > > > > > efi_memmap_insert() to attempt more memmap splits / copies than
> > > > > > efi_memmap_split_count() accounted for when sizing the new map. When
> > > > > > that happens efi_memmap_insert() may overrun its allocation, and if you
> > > > > > > are lucky will spill over to an unmapped page leading to a crash
> > > > > > signature like the following rather than silent corruption:
> > > > > >
> > > > > > BUG: unable to handle page fault for address: ffffffffff281000
> > > > > > [..]
> > > > > > RIP: 0010:efi_memmap_insert+0x11d/0x191
> > > > > > [..]
> > > > > > Call Trace:
> > > > > > ? bgrt_init+0xbe/0xbe
> > > > > > ? efi_arch_mem_reserve+0x1cb/0x228
> > > > > > ? acpi_parse_bgrt+0xa/0xd
> > > > > > ? acpi_table_parse+0x86/0xb8
> > > > > > ? acpi_boot_init+0x494/0x4e3
> > > > > > ? acpi_parse_x2apic+0x87/0x87
> > > > > > ? setup_acpi_sci+0xa2/0xa2
> > > > > > ? setup_arch+0x8db/0x9e1
> > > > > > ? start_kernel+0x6a/0x547
> > > > > > ? secondary_startup_64+0xb6/0xc0
> > > > > >
> > > > > > Commit af1648984828 "x86/efi: Update e820 with reserved EFI boot
> > > > > > services data to fix kexec breakage" is listed in Fixes: since it
> > > > > > introduces more occurrences where efi_memmap_insert() is invoked after
> > > > > > an efi_fake_mem= configuration has been parsed. Previously the side
> > > > > > effects of vestigial empty entries were benign, but with commit
> > > > > > af1648984828 that follow-on efi_memmap_insert() invocation triggers
> > > > > > efi_memmap_insert() overruns.
> > > > > >
> > > > > > Fixes: 0f96a99dab36 ("efi: Add 'efi_fake_mem' boot option")
> > > > > > Fixes: af1648984828 ("x86/efi: Update e820 with reserved EFI boot services...")
> > > > >
> > > > > A nitpick for the Fixes flags, as I replied in the thread below:
> > > > > https://lore.kernel.org/linux-efi/CAPcyv4jLxqPaB22Ao9oV31Gm=b0+Phty+Uz33Snex4QchOUb0Q@xxxxxxxxxxxxxx/T/#m2bb2dd00f7715c9c19ccc48efef0fcd5fdb626e7
> > > > >
> > > > > I reproduced two other panics without the patches applied, so this issue
> > > > > is not caused by either of the commits, maybe just drop the Fixes.
> > > >
> > > > Just the "Fixes: af1648984828", right? No objection from me. I'll let
> > > > Ingo say if he needs a resend for that.
> > > >
> > > > The "Fixes: 0f96a99dab36" is valid because the original implementation
> > > > failed to handle the multiple argument case from the beginning.
> > >
> > > Agreed, thanks!
> > >
> >
> > I'll queue this but without the fixes tags. The -stable maintainers
> > are far too trigger happy IMHO, and this really needs careful review
> > before being backported. efi_fake_mem is a debug feature anyway, so I
> > don't see an urgent need to get this fixed retroactively in older
> > kernels.
>
> I'm fine to drop the fixes tags.
>
> However, I do want to point out my driving motive for digging in on
> efi_fake_mem=nn@ss:0x40000, is that it is a better interface for
> diverting memory ranges to device-dax than the current standard bearer
> memmap=nn!ss. The main benefit is that the kernel only considers the
> attribute when it is applied to EFI_CONVENTIONAL_MEMORY. This fixes a
> long standing unsolvable issue of people picking busted memmap=nn!ss
> settings that clobber platform memory ranges, or vector off into
> nothing.
>
> So yes, efi_fake_mem is a debug feature, but if the popularity of
> memmap=nn!ss is any clue I expect efi_fake_mem=nn@ss:0x40000 will be a
> useful tool going forward for late enabling, or repairing platform
> "soft reservation" declarations.
>

OK, good to know.

> I'll respin the series with those tags dropped and add the comment you
> recommended about the cases when efi_memmap_free() is expected to be a
> nop. Holler if there's anything else, but that's all I had on my list
> to fix up.

If it's just for the comment, I can just slap that on, as I already
queued the patches with the fixes tags dropped. Or respin, whichever
you prefer (efi/next branch is not stable anyway).
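
For reference, the multiple-entry problem described in the quoted commit message can be illustrated with a toy model. This is only a minimal sketch and not the kernel code: every name below (toy_range, toy_map, toy_insert(), toy_apply_buggy(), toy_apply_fixed(), toy_count_attr()) is hypothetical, and the map is a fixed-size array rather than a real EFI memory map. It shows only the one point from the description: if each insertion starts from the original map, just the last entry survives, whereas installing the new map after each entry keeps all of them.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX 16	/* hypothetical capacity of the toy map */

struct toy_range { uint64_t start, end, attr; };	/* covers [start, end) */
struct toy_map   { struct toy_range e[TOY_MAX]; size_t n; };

static void toy_push(struct toy_map *m, uint64_t s, uint64_t e, uint64_t a)
{
	if (m->n < TOY_MAX)	/* toy bound check: drops entries on overflow */
		m->e[m->n++] = (struct toy_range){ s, e, a };
}

/*
 * Build 'out' from 'old', splitting every entry that overlaps 'f' and
 * OR-ing f->attr into the overlapping part. 'old' and 'out' must differ.
 */
static void toy_insert(const struct toy_map *old, struct toy_map *out,
		       const struct toy_range *f)
{
	out->n = 0;
	for (size_t i = 0; i < old->n; i++) {
		const struct toy_range *r = &old->e[i];
		uint64_t lo = r->start > f->start ? r->start : f->start;
		uint64_t hi = r->end < f->end ? r->end : f->end;

		if (lo >= hi) {			/* no overlap: copy as-is */
			toy_push(out, r->start, r->end, r->attr);
			continue;
		}
		if (r->start < lo)		/* untouched head */
			toy_push(out, r->start, lo, r->attr);
		toy_push(out, lo, hi, r->attr | f->attr);
		if (hi < r->end)		/* untouched tail */
			toy_push(out, hi, r->end, r->attr);
	}
}

/* Buggy pattern: every insertion is applied to the original map. */
static void toy_apply_buggy(const struct toy_map *orig,
			    const struct toy_range *fake, size_t nfake,
			    struct toy_map *result)
{
	*result = *orig;
	for (size_t i = 0; i < nfake; i++)
		toy_insert(orig, result, &fake[i]);	/* overwrites earlier work */
}

/* Fixed pattern: the new map is installed after each entry. */
static void toy_apply_fixed(const struct toy_map *orig,
			    const struct toy_range *fake, size_t nfake,
			    struct toy_map *result)
{
	struct toy_map cur = *orig, next;

	for (size_t i = 0; i < nfake; i++) {
		toy_insert(&cur, &next, &fake[i]);
		cur = next;			/* next insertion sees this one */
	}
	*result = cur;
}

/* Count entries that carry all bits of 'attr'. */
static size_t toy_count_attr(const struct toy_map *m, uint64_t attr)
{
	size_t count = 0;

	for (size_t i = 0; i < m->n; i++)
		if ((m->e[i].attr & attr) == attr)
			count++;
	return count;
}
```

With a single original range and two non-overlapping fake entries, the buggy variant ends up with only the second entry marked, while the fixed variant marks both.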