Re: [PATCH] x86/Kconfig: decrease maximum of X86_RESERVE_LOW to 512K

From: Mike Rapoport
Date: Thu May 27 2021 - 09:38:21 EST


On Wed, May 26, 2021 at 08:14:44PM +0200, Borislav Petkov wrote:
> On Wed, May 26, 2021 at 07:30:09PM +0300, Mike Rapoport wrote:
> > We can restore that behaviour, but it feels like cheating to me. We let
> > user say "Hey, don't touch low memory at all", even though we know we must
> > use at least some of it. And then we sneak in an allocation under 640K
> > despite user's request not to use it.
>
> Sure but how are we going to tell the user that if we don't sneak that
> allocation, we won't boot at all. I believe user would kinda like the
> box to boot still, no? :-)
>
> Yeah, you have that now:
>
> + Note, that a part of the low memory range is still required for
> + kernel to boot properly.
>
> but then why is 512 ok? And why was 640K the upper limit?

Well, 640K is a well-known memory limit :)
And 512k is the closest power of 2 which still leaves plenty of space for
the trampoline.

> Looking at:
>
> d0cd7425fab7 ("x86, bios: By default, reserve the low 64K for all BIOSes")
>
> and reading that bugzilla
>
> https://bugzilla.kernel.org/show_bug.cgi?id=16661
>
> it sounds like it is the amount of memory where BIOS could put crap in.
>
> Long story short, we reserve the first 64K by default so if someone
> reserves the total range of 640K the early code could probably say
> something like
>
> "adjusting upper reserve limit to X for the real-time trampoline"
>
> when the upper limit is too high so that a trampoline can't fit...
>
> Which is basically what your solution does...
>
> But then the previous behavior used to work everywhere so if it is only
> cheating, I don't mind doing that as long as boxes keep on booting.
>
> Or am I missing an aspect?

Another aspect IMHO is that making things explicit would reduce the number
of hidden dependencies and in the end make x86::setup_arch() less fragile.

I'm looking now also at:

5bc653b73182 ("x86/efi: Allocate a trampoline if needed in efi_free_boot_services()")

that retries the allocation of the trampoline when we free EFI boot
services, so there could also be a conflict between reserve_real_mode() and
reserve_bios_regions() in case EBDA is too low.

So what we have is:
- BIOSes that corrupt low memory
- EBDA of unknown size that can be as low as 128k, so we reserve everything
  from EBDA start to 640k because we don't trust BIOSes to report EBDA size
  properly
- Real mode blob of about 20-30k that must live in the first 640k
- Build time setting to reserve Xk (4k <= X <= 640k) with the default set
  to 64k
- Command line option to reserve Yk (4k <= Y <= 640k); this takes precedence
  over the build time option
- A late fallback that uses memory freed from EFI data to place the real
  mode trampoline there

It seems to me that we can drop both build time and run time options
entirely, reserve 64k early to avoid having the trampoline there and then
always reserve everything below 640k after reserve_real_mode().

The late fallback for systems that have most of low memory busy with
BIOS/EFI will remain intact as it does not do memblock allocation anyway.
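
To make the ordering concrete, here is a rough user-space model of it. This
is not kernel code: struct lowmem is a toy stand-in for memblock, and
lowmem_reserve(), lowmem_alloc() and place_trampoline() are names made up
for the illustration. The top-down search is an assumption about how the
allocator picks an address, and the 32k blob size used below is just a round
number above the 20-30k estimate.

```c
#include <assert.h>
#include <stdbool.h>

#define SZ_1K      1024UL
#define PAGE_SZ    (4 * SZ_1K)
#define LOW_END    (640 * SZ_1K)        /* end of conventional memory */
#define NPAGES     (LOW_END / PAGE_SZ)
#define ALLOC_FAIL (~0UL)

/* Toy stand-in for memblock: one "reserved" flag per 4k page below 640k. */
struct lowmem {
	bool reserved[NPAGES];
};

/* Mark [start, end) as reserved; both bounds must be page aligned. */
void lowmem_reserve(struct lowmem *m, unsigned long start, unsigned long end)
{
	assert(start % PAGE_SZ == 0 && end % PAGE_SZ == 0 && end <= LOW_END);
	for (unsigned long p = start / PAGE_SZ; p < end / PAGE_SZ; p++)
		m->reserved[p] = true;
}

/*
 * Find the highest run of free pages that fits `size` bytes, reserve it
 * and return its start address, or ALLOC_FAIL if nothing fits.
 */
unsigned long lowmem_alloc(struct lowmem *m, unsigned long size)
{
	unsigned long need = (size + PAGE_SZ - 1) / PAGE_SZ;
	unsigned long run = 0;

	assert(size > 0);
	for (unsigned long p = NPAGES; p-- > 0; ) {
		run = m->reserved[p] ? 0 : run + 1;
		if (run == need) {
			lowmem_reserve(m, p * PAGE_SZ, (p + need) * PAGE_SZ);
			return p * PAGE_SZ;
		}
	}
	return ALLOC_FAIL;
}

/* The proposed order of operations, with no user-tunable reserve size. */
unsigned long place_trampoline(struct lowmem *m, unsigned long ebda_start,
			       unsigned long blob_size)
{
	unsigned long addr;

	lowmem_reserve(m, 0, 64 * SZ_1K);       /* keep the trampoline out of the first 64k */
	lowmem_reserve(m, ebda_start, LOW_END); /* EBDA start up to 640k is not trusted */
	addr = lowmem_alloc(m, blob_size);      /* real mode blob goes in what is left */
	lowmem_reserve(m, 0, LOW_END);          /* afterwards nothing else lands below 640k */
	return addr;
}
```

In this model, an EBDA starting at 128k and a 32k blob put the trampoline at
96k. With an EBDA starting at 64k nothing is left, place_trampoline()
returns ALLOC_FAIL, and that is the case where only the late EFI fallback
can help.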

--
Sincerely yours,
Mike.