Re: [PATCH] x86/boot: Reorganize and clean up the BIOS area reservation code

From: Andy Lutomirski
Date: Thu Jul 21 2016 - 18:45:40 EST

On Thu, Jul 21, 2016 at 2:48 PM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> On Thu, Jul 21, 2016 at 2:28 PM, H. Peter Anvin <hpa@xxxxxxxxx> wrote:
>> On July 21, 2016 2:08:12 PM PDT, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>>>On Thu, Jul 21, 2016 at 9:18 AM, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>>>> * Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>>>>> It would be very easy to implement this if we could handle overlapping
>>>>> memblocks precisely or set a lower limit on the memblock allocator. Then
>>>>> we could block off everything below 1MB or 2MB very early and then
>>>>> unblock it or change the lower limit and ask for a single page for the
>>>>> trampoline after that.
>>>> So my suggestion was/is to _permanently_ allocate the SMP trampoline
>>>> page, and leave it also reserved.
>>>> 'Reserving' a memory area is really just a kernel internal matter. We
>>>> can still use it. No need to unreserve/allocate/re-reserve ... unless
>>>> I'm missing something.
>>>I don't think you're missing anything particularly deep. I'm just
>>>talking about an implementation issue. We need to make sure that the
>>>page we pick for the trampoline isn't reserved in the memory map or by
>>>some other quirk (including the EBDA). The kernel currently uses
>>>memblock for this, which means that we should probably play nicely
>>>with the memblock code.
>>>To fix my laptop, though, I think we either need to change the EBDA
>>>reservation (i.e. be willing to pick a page above the EBDA but below
>>>the BIOS) or rework the code so that it can use a BOOT_SERVICES_DATA
>> Oh... this is booting in EFI mode. Now it makes a bit more sense. This
>> is a headache because I believe the same company has put out systems
>> which crash if boot services memory is reclaimed before driver
>> initialization, which is the only reason we can't just treat it as
>> plain memory immediately after invoking ExitBootServices.
>> In theory EBDA in EFI mode is nonsense, too, but not trying to be smart
>> about it might break some systems due to interaction with ACPI.
> It's easy to make reserve_bios_regions() do nothing if we have an EFI
> memory map. Is there any decent reason *not* to do that? Or we could
> be more conservative: if the EFI memory map reserves 640K-1M and
> reserves the EBDA (if any), then don't reserve the range between the
> EBDA and 640K. (That would fix my laptop, too.)
> Allowing the trampoline to use boot services addresses is fine, too,
> but that looks harder to implement.
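
To make the conservative option quoted above concrete, here is a rough
userspace sketch of the check, not kernel code. Everything in it is
made up for illustration: struct resv_range, range_covered() and
can_skip_ebda_gap() are hypothetical names, the reserved ranges are
assumed to be sorted by start address, and the caller is assumed to
know the EBDA extent (ebda_start == 0 meaning "no EBDA").

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: one reserved range from the EFI memory map, [start, end). */
struct resv_range {
	unsigned long start;
	unsigned long end;
};

/*
 * Return true if [start, end) is fully covered by the reserved ranges.
 * Assumes resv[] is sorted by ascending start address.
 */
bool range_covered(const struct resv_range *resv, size_t n,
		   unsigned long start, unsigned long end)
{
	unsigned long pos = start;

	assert(start <= end);
	for (size_t i = 0; i < n && pos < end; i++) {
		if (resv[i].start > pos)
			break;			/* gap before the next range */
		if (resv[i].end > pos)
			pos = resv[i].end;	/* extend the covered prefix */
	}
	return pos >= end;
}

/*
 * Conservative policy: only skip reserving the EBDA..640K gap if the
 * EFI map already reserves 640K-1M and the EBDA itself (if any).
 */
bool can_skip_ebda_gap(const struct resv_range *resv, size_t n,
		       unsigned long ebda_start, unsigned long ebda_end)
{
	if (!range_covered(resv, n, 0xa0000UL, 0x100000UL))
		return false;
	if (ebda_start == 0)			/* no EBDA reported */
		return true;
	return range_covered(resv, n, ebda_start, ebda_end);
}
```

With a map that reserves 0x9f000-0x100000 and an EBDA at 0x9f000, this
says the gap can be left alone; if the map only reserves 640K-1M and
not the EBDA, it falls back to the existing behaviour.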

I looked at the code some more. The boot services quirk is weird and
maybe buggy. trim_snb_memory() uses memblock_reserve() to reserve the
bottom 1MB. If efi_reserve_real_mode() has already reserved that range,
then trim_snb_memory()'s reservation will have no effect because the EFI
code will just free it later on. The same issue will hit any code
that reserves >1MB memory after EFI has tried to temporarily reserve
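
To illustrate the ordering problem, here is a toy userspace model. It
is only a sketch: the toy_* names are invented, and the one assumption
it encodes is that the reserved list records *that* a range is
reserved, not who reserved it or how many times, so a later free drops
every overlapping reservation.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a reserved list: one flag per 4K page of the first 2MB. */
#define TOY_PAGE_SIZE	4096UL
#define TOY_NPAGES	512UL

bool toy_reserved[TOY_NPAGES];

/* Set or clear the reserved flag for every page in [base, base + size). */
void toy_set(unsigned long base, unsigned long size, bool val)
{
	assert(base % TOY_PAGE_SIZE == 0 && size % TOY_PAGE_SIZE == 0);
	assert((base + size) / TOY_PAGE_SIZE <= TOY_NPAGES);
	for (unsigned long p = base / TOY_PAGE_SIZE;
	     p < (base + size) / TOY_PAGE_SIZE; p++)
		toy_reserved[p] = val;
}

void toy_reserve(unsigned long base, unsigned long size)
{
	toy_set(base, size, true);
}

void toy_free(unsigned long base, unsigned long size)
{
	toy_set(base, size, false);
}

bool toy_is_reserved(unsigned long addr)
{
	assert(addr / TOY_PAGE_SIZE < TOY_NPAGES);
	return toy_reserved[addr / TOY_PAGE_SIZE];
}

/*
 * Replay the ordering described above.  Returns true if the second,
 * supposedly permanent, reservation survived for the overlapping pages.
 */
bool toy_replay_quirk(void)
{
	/* 1. The EFI quirk temporarily reserves a boot services region. */
	toy_reserve(0x10000UL, 0x20000UL);
	/* 2. A later caller wants the whole bottom 1MB kept forever. */
	toy_reserve(0x0UL, 0x100000UL);
	/* 3. The EFI quirk releases "its" region, unaware of step 2. */
	toy_free(0x10000UL, 0x20000UL);

	return toy_is_reserved(0x10000UL);
}
```

In this model toy_replay_quirk() returns false: the pages at
0x10000-0x30000 end up unreserved even though step 2 asked for the
whole bottom 1MB.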

I don't have any great suggestions for cleaning it up. Perhaps the
EFI code should instead skip adding boot services memory to the memory
map in the first place and then add it late and hand any unreserved
bits to the buddy allocator?
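
As a rough illustration of that idea (again a userspace toy, not a
patch; struct late_map and the late_* helpers are hypothetical), the
point is that if boot services pages are simply absent from the map
early on, nobody has to reserve and later free them, so reservations
made by other code are never undone:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define LATE_PAGE_SIZE	4096UL
#define LATE_NPAGES	512UL		/* models the first 2MB */

struct late_map {
	bool usable[LATE_NPAGES];	/* page is in the map as RAM */
	bool reserved[LATE_NPAGES];	/* someone asked to keep it */
};

/* Set flags[] for every page in [base, base + size). */
void late_mark(bool *flags, unsigned long base, unsigned long size, bool val)
{
	assert(base % LATE_PAGE_SIZE == 0 && size % LATE_PAGE_SIZE == 0);
	assert((base + size) / LATE_PAGE_SIZE <= LATE_NPAGES);
	for (unsigned long p = base / LATE_PAGE_SIZE;
	     p < (base + size) / LATE_PAGE_SIZE; p++)
		flags[p] = val;
}

/*
 * Late step: add a deferred boot services region to the map and return
 * how many of its pages could be handed to the page allocator, i.e.
 * those that nobody reserved in the meantime.
 */
unsigned long late_add_region(struct late_map *m, unsigned long base,
			      unsigned long size)
{
	unsigned long released = 0;

	late_mark(m->usable, base, size, true);
	for (unsigned long p = base / LATE_PAGE_SIZE;
	     p < (base + size) / LATE_PAGE_SIZE; p++)
		if (!m->reserved[p])
			released++;
	return released;
}

unsigned long late_demo(void)
{
	struct late_map m;
	unsigned long released = 0;

	memset(&m, 0, sizeof(m));

	/* Early: all of the first 2MB is RAM except two boot services regions. */
	late_mark(m.usable, 0x0UL, 0x200000UL, true);
	late_mark(m.usable, 0x10000UL, 0x20000UL, false);
	late_mark(m.usable, 0x180000UL, 0x10000UL, false);

	/* Some later code reserves the bottom 1MB; nothing ever frees this. */
	late_mark(m.reserved, 0x0UL, 0x100000UL, true);

	/* Late: add the deferred regions, releasing only unreserved pages. */
	released += late_add_region(&m, 0x10000UL, 0x20000UL);
	released += late_add_region(&m, 0x180000UL, 0x10000UL);
	return released;
}
```

Here late_demo() releases only the 16 pages of the region above 1MB;
the region below 1MB stays put because the bottom-1MB reservation is
still intact when the late step runs.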