Re: [PATCH] x86/boot/KASLR: exclude EFI_BOOT_SERVICES_{CODE|DATA} from KASLR's choice

From: Baoquan He
Date: Sun Jul 09 2017 - 06:44:44 EST


On 07/07/17 at 11:56am, Matt Fleming wrote:
> On Fri, 07 Jul, at 11:07:59AM, Baoquan He wrote:
> > On 07/06/17 at 03:57pm, Matt Fleming wrote:
> > > On Thu, 06 Jul, at 08:31:07AM, Naoya Horiguchi wrote:
> > > > + for (i = 0; i < nr_desc; i++) {
> > > > + md = (efi_memory_desc_t *)(pmap + (i * e->efi_memdesc_size));
> > > > +
> > > > + /*
> > > > + * EFI_BOOT_SERVICES_{CODE|DATA} are avoided because boot
> > > > + * services regions could be accessed after ExitBootServices()
> > > > + * due to the workaround for buggy firmware.
> > > > + */
> > > > + if (!(md->type == EFI_LOADER_CODE ||
> > > > + md->type == EFI_LOADER_DATA ||
> > > > + md->type == EFI_CONVENTIONAL_MEMORY))
> > > > + continue;
> > >
> > > Wouldn't it make more sense to *only* use EFI_CONVENTIONAL_MEMORY?
> > >
> > > You can't re-use EFI_LOADER_* regions because the kaslr code is run so
> > > early in boot that you've no idea if data the kernel will need is in
> > > those EFI_LOADER_* regions.
> > >
> > > For example, we pass struct setup_data objects inside of
> > > EFI_LOADER_DATA regions.
> >
> > It doesn't matter because we have tried to avoid those memory setup_data
> > resides in in mem_avoid_overlap(). Here discarding EFI_LOADER_* could
> > discard the whole regions while setup_data could occupy small part of
> > them.
>
> What about the GDT that we allocate in the x86 EFI boot stub as
> EFI_LOADER_DATA? Are there functions to avoid that too?

This is a very good question. For the current e820 processing, we don't
avoid GDT allocated in x86 EFI STUB because we have no information about
GDT in EFI STUB. You can see setup_e820() in boot/compressed/eboot.c
grabs EFI_LOADER_* regions as E820_TYPE_RAM. Now the GDT which is built
in EFI STUB code will live till kernel is ready to build its onw gdt(in
kernel/head_64.S). After that it become useless and can be reclaimed for
reusing. I believe Naoya must have read boot/compressed/eboot.c and
found setup_e820() code, then added EFI_LOADER_* for kaslr usage.

Previously I thought e820 should take the precedence to be processed
on uefi system because continuous efi memory regions will be merged into
e820 regions. Using e820 can avoid those small efi regions or the left
part of efi regions being discarded directly when they are smaller than
kernel image size. Now considering this GDT in efi stub issue, we have to
try efi regions first on uefi system. Otherwise it may cause problem that
kernel could be decompressed onto the GDT tables of efi stub.

In fact, I am wondering if we can reuse the gdt table which is built
before entering into long mode in boot/compressed/head_64.S, but not
allocate memory for GDT in efi stub. The thing is 32bit system doesn't
have this gdt table in boot compressing stage since it has one before
entering into protection mode. Just personal thought.
>
> What about any future uses we add? Who's going to remember to patch
> the kaslr code which now duplicates some of the EFI memory map logic?
>
> All of these problems can avoided if you just stick with
> EFI_CONVENTIONAL_MEMORY.

Below is the dmesg with 'efi=debug' adding on my ovmf uefi kvm guest.
Since uefi could do a lot of thing when loading OS, E.g loading into any
kind of storage driver application, firmware usually reserve a chunk of
memory of hunderes of Mega Bytes. Like this one:

******
[ +0.000000] efi: mem12: [Loader Data | ||WB|WT|WC|UC] range=[0x000000005c8eb000-0x000000007bfbdfff] (502MB)
******

We can't say it's a big deal, but 500MB is also not so trival that we
can easily ignore it without any consideration. Just uefi spec doesn't
define the limitation of Loader memory and Conventional memory and if
Conventional memory has to be present. With my understanding, there won't
be any problem if only Loader memory exists.

Further more, kaslr is not a precise searching job, no specific address
has to be positioned. Even it's OK that the physical address
randomization failed to find a new address randomly. So it's fine to me
that we don't take EFI_LOADER_* memory into consideration for kaslr. BUT
we need make this clear that why not, and if we can do anyting to make
it better.

[ +0.000000] efi: SMBIOS=0x7fed5000 ACPI=0x7ff03000 ACPI 2.0=0x7ff03014 MEMATTR=0x7ea50218
[ +0.000000] efi: mem00: [Boot Code | | | | | | | | |WB|WT|WC|UC] range=[0x0000000000000000-0x0000000000000fff] (0MB)
[ +0.000000] efi: mem01: [Loader Data | | | | | | | | |WB|WT|WC|UC] range=[0x0000000000001000-0x0000000000001fff] (0MB)
[ +0.000000] efi: mem02: [Conventional Memory| | | | | | | | |WB|WT|WC|UC] range=[0x0000000000002000-0x000000000009ffff] (0MB)
[ +0.000000] efi: mem03: [Conventional Memory| | | | | | | | |WB|WT|WC|UC] range=[0x0000000000100000-0x0000000000805fff] (7MB)
[ +0.000000] efi: mem04: [Boot Data | | | | | | | | |WB|WT|WC|UC] range=[0x0000000000806000-0x0000000000806fff] (0MB)
[ +0.000000] efi: mem05: [Conventional Memory| | | | | | | | |WB|WT|WC|UC] range=[0x0000000000807000-0x000000000081ffff] (0MB)
[ +0.000000] efi: mem06: [Boot Data | | | | | | | | |WB|WT|WC|UC] range=[0x0000000000820000-0x00000000012fffff] (10MB)
[ +0.000000] efi: mem07: [Conventional Memory| | | | | | | | |WB|WT|WC|UC] range=[0x0000000001300000-0x0000000001ffffff] (13MB)
[ +0.000000] efi: mem08: [Loader Data | | | | | | | | |WB|WT|WC|UC] range=[0x0000000002000000-0x0000000003614fff] (22MB)
[ +0.000000] efi: mem09: [Conventional Memory| | | | | | | | |WB|WT|WC|UC] range=[0x0000000003615000-0x000000003d6b3fff] (928MB)
[ +0.000000] efi: mem10: [Loader Data | | | | | | | | |WB|WT|WC|UC] range=[0x000000003d6b4000-0x000000003fffffff] (41MB)
[ +0.000000] efi: mem11: [Conventional Memory| | | | | | | | |WB|WT|WC|UC] range=[0x0000000040000000-0x000000005c8eafff] (456MB)
[ +0.000000] efi: mem12: [Loader Data | | | | | | | | |WB|WT|WC|UC] range=[0x000000005c8eb000-0x000000007bfbdfff] (502MB)
[ +0.000000] efi: mem13: [Boot Data | | | | | | | | |WB|WT|WC|UC] range=[0x000000007bfbe000-0x000000007bfddfff] (0MB)
[ +0.000000] efi: mem14: [Conventional Memory| | | | | | | | |WB|WT|WC|UC] range=[0x000000007bfde000-0x000000007e6effff] (39MB)
[ +0.000000] efi: mem15: [Loader Data | | | | | | | | |WB|WT|WC|UC] range=[0x000000007e6f0000-0x000000007e7e3fff] (0MB)
[ +0.000000] efi: mem16: [Loader Code | | | | | | | | |WB|WT|WC|UC] range=[0x000000007e7e4000-0x000000007e90afff] (1MB)
[ +0.000000] efi: mem17: [Conventional Memory| | | | | | | | |WB|WT|WC|UC] range=[0x000000007e90b000-0x000000007e914fff] (0MB)
[ +0.000000] efi: mem18: [Loader Data | | | | | | | | |WB|WT|WC|UC] range=[0x000000007e915000-0x000000007ea46fff] (1MB)
[ +0.000000] efi: mem19: [Boot Data | | | | | | | | |WB|WT|WC|UC] range=[0x000000007ea47000-0x000000007eb8dfff] (1MB)
[ +0.000000] efi: mem20: [Boot Code | | | | | | | | |WB|WT|WC|UC] range=[0x000000007eb8e000-0x000000007edd0fff] (2MB)
[ +0.000000] efi: mem21: [Runtime Data |RUN| | | | | | | |WB|WT|WC|UC] range=[0x000000007edd1000-0x000000007edd5fff] (0MB)
[ +0.000000] efi: mem22: [Runtime Code |RUN| | | | | | | |WB|WT|WC|UC] range=[0x000000007edd6000-0x000000007edddfff] (0MB)
[ +0.000000] efi: mem23: [Runtime Data |RUN| | | | | | | |WB|WT|WC|UC] range=[0x000000007edde000-0x000000007ede2fff] (0MB)
[ +0.000000] efi: mem24: [Runtime Code |RUN| | | | | | | |WB|WT|WC|UC] range=[0x000000007ede3000-0x000000007ede8fff] (0MB)
[ +0.000000] efi: mem25: [Runtime Data |RUN| | | | | | | |WB|WT|WC|UC] range=[0x000000007ede9000-0x000000007ee12fff] (0MB)
[ +0.000000] efi: mem26: [Runtime Code |RUN| | | | | | | |WB|WT|WC|UC] range=[0x000000007ee13000-0x000000007ee23fff] (0MB)
[ +0.000000] efi: mem27: [Boot Data | | | | | | | | |WB|WT|WC|UC] range=[0x000000007ee24000-0x000000007f46dfff] (6MB)
[ +0.000000] efi: mem28: [Conventional Memory| | | | | | | | |WB|WT|WC|UC] range=[0x000000007f46e000-0x000000007f477fff] (0MB)
[ +0.000000] efi: mem29: [Boot Data | | | | | | | | |WB|WT|WC|UC] range=[0x000000007f478000-0x000000007fd23fff] (8MB)
[ +0.000000] efi: mem30: [Conventional Memory| | | | | | | | |WB|WT|WC|UC] range=[0x000000007fd24000-0x000000007fd24fff] (0MB)
[ +0.000000] efi: mem31: [Boot Code | | | | | | | | |WB|WT|WC|UC] range=[0x000000007fd25000-0x000000007fea3fff] (1MB)
[ +0.000000] efi: mem32: [Runtime Code |RUN| | | | | | | |WB|WT|WC|UC] range=[0x000000007fea4000-0x000000007fed3fff] (0MB)
[ +0.000000] efi: mem33: [Runtime Data |RUN| | | | | | | |WB|WT|WC|UC] range=[0x000000007fed4000-0x000000007fef7fff] (0MB)
[ +0.000000] efi: mem34: [Reserved | | | | | | | | |WB|WT|WC|UC] range=[0x000000007fef8000-0x000000007fefbfff] (0MB)
[ +0.000000] efi: mem35: [ACPI Reclaim Memory| | | | | | | | |WB|WT|WC|UC] range=[0x000000007fefc000-0x000000007ff03fff] (0MB)
[ +0.000000] efi: mem36: [ACPI Memory NVS | | | | | | | | |WB|WT|WC|UC] range=[0x000000007ff04000-0x000000007ff07fff] (0MB)
[ +0.000000] efi: mem37: [Boot Data | | | | | | | | |WB|WT|WC|UC] range=[0x000000007ff08000-0x000000007ff6afff] (0MB)
[ +0.000000] efi: mem38: [Boot Code | | | | | | | | |WB|WT|WC|UC] range=[0x000000007ff6b000-0x000000007ff95fff] (0MB)
[ +0.000000] efi: mem39: [Boot Data | | | | | | | | |WB|WT|WC|UC] range=[0x000000007ff96000-0x000000007ffa6fff] (0MB)
[ +0.000000] efi: mem40: [Boot Code | | | | | | | | |WB|WT|WC|UC] range=[0x000000007ffa7000-0x000000007ffcffff] (0MB)
[ +0.000000] efi: mem41: [Runtime Data |RUN| | | | | | | |WB|WT|WC|UC] range=[0x000000007ffd0000-0x000000007ffeffff] (0MB)
[ +0.000000] efi: mem42: [Conventional Memory| | | | | | | | |WB|WT|WC|UC] range=[0x000000007fff0000-0x000000007fffffff] (0MB)
[ +0.000000] efi: mem43: [Runtime Data |RUN| | | | | | || | | |UC] range=[0x00000000ffe00000-0x00000000ffffffff] (2MB)

>
> Honestly, how much memory do we expect to waste if we ignore
> EFI_LOADER_* regions?
>
> Also, the fact that you're referencing EFI-specific boot quirks in the
> kaslr code should be a massive red flag that you're playing with the
> innards of the EFI subsystem.

Questions has been answered in above words.