Re: [PATCH v3 1/3] Documentation: arm: add UEFI support documentation

From: Matt Sealey
Date: Fri Dec 06 2013 - 12:20:54 EST


On Wed, Dec 4, 2013 at 4:44 PM, Matthew Garrett <mjg59@xxxxxxxxxxxxx> wrote:
> On Wed, Dec 04, 2013 at 03:06:47PM -0600, Matt Sealey wrote:
>
>> there's no guarantee that the kernel hasn't been decompressed over
>> some important UEFI feature or some memory hasn't been trashed. You
>> can't make that guarantee because by entering the plain zImage, you
>> forfeited that information.
>
> The stub is responsible for ensuring that the compressed kernel is
> loaded at a suitable address. Take a look at efi_relocate_kernel().

My objection is that the "suitable address" is based on a restriction
that booting from UEFI doesn't have, and it ignores information UEFI
provides that would make much of what head.S (both of them) does
unnecessary.
The kernel doesn't need to be within a particular range of the start
of memory, nor does the device tree or ramdisk require being in a
particular place. What the code before efi_relocate_kernel does is
allocate a maximum-sized-buffer to safely decompress in, which is just
a gross way to do it, then crosses its fingers based on the way it
has historically worked - while you might want to assume that the
decompression process is quite well defined and reliable, I keep
seeing patches come in that stop it from doing weird, unsavory things
- for example, decompressing over its own page table.
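
To make the overlap problem concrete, here is a minimal sketch of the
kind of check I mean. It is not kernel code: struct mem_range and
ranges_overlap() are hypothetical names, and it only assumes you know
the start and size of both the decompression output and the page
table, which is exactly the information UEFI could hand you.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical type, not from the kernel: half-open range [start, start + size). */
struct mem_range {
	uint64_t start;
	uint64_t size;
};

/*
 * Return true if the two ranges share at least one byte.
 * Written with subtraction so that start + size cannot overflow.
 */
static bool ranges_overlap(struct mem_range a, struct mem_range b)
{
	if (a.size == 0 || b.size == 0)
		return false;
	if (a.start < b.start)
		return b.start - a.start < a.size;
	return a.start - b.start < b.size;
}
```

With known ranges, "don't decompress over the page table" becomes one
test before the decompressor runs rather than a guess about layout.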

The decompressor - and the kernel head it jumps to after decompression
- *guess* all the information UEFI could have provided and completely
regenerate the environment for the decompressor itself (stacks, hacky
memory allocations, cache on, off, on, off, on... fudging locations of
page tables, zreladdr fixup, low level debug message output, in
context of UEFI - reimplementation of memcpy, memset). It forfeits a
more controlled and lean boot process to capitulate to a historical
legacy. Since you're taking over the decompressor head.S anyway, why
not take control of the decompression process?

It sets up a page table location the hard way (as above.. also patched
recently not to decompress over its own page table). It doesn't need
to relocate itself past the end of the decompressed image. It doesn't
need to set up the C environment - UEFI did that for it. It makes
assumptions about the stack and hacks memory allocations for the
decompression.. it turns the cache on, decompresses, then turns it off
again... you can just walk through the code under the EFI stub in
compressed/head.S and see all this can just fall away.

There's one immediate advantage too, if it's actually implemented and
working, which is that for kernel images that are compressed using the
standard UEFI compression method no actual decompression code needs to
be added to the stub, and the functionality gets the exact length of
the required decompression buffer.. that doesn't reduce flexibility in
kernel compression as long as there is still the possibility of adding
additional compression code to the stub.
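
As a rough illustration of where the exact length comes from: as I
understand the UEFI compression format, the stream starts with two
32-bit little-endian fields, the compressed size followed by the
original (decompressed) size, and the decompress protocol's GetInfo()
reports the latter as the destination size. The sketch below reads
that header by hand. The function names are hypothetical, and a real
stub would call the firmware service instead of parsing this itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian value byte by byte (no alignment assumptions). */
static uint32_t read_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: fetch the decompressed size from the 8-byte
 * header of a UEFI-compressed stream (assumed layout: compressed size,
 * then original size). Returns 0 on success, -1 if the buffer is too
 * short or the compressed size does not fit inside it.
 */
static int uefi_compressed_orig_size(const uint8_t *src, size_t src_size,
				     uint32_t *orig_size)
{
	uint32_t comp_size;

	if (src == NULL || orig_size == NULL || src_size < 8)
		return -1;

	comp_size = read_le32(src);
	if (comp_size > src_size - 8)
		return -1;

	*orig_size = read_le32(src + 4);
	return 0;
}
```

The point is only that the buffer size is known before decompression
starts, so there is no need to allocate a worst-case buffer and hope.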

The second immediate advantage is that the EFI stub/decompressor can
actually verify that the *decompressed* image meets Secure Boot
requirements.

Once you get past the decompressor and into the kernel proper head.S,
creating the page tables (again) and turning the MMU on, pv table
patching.. if you still had the information around, that gets simpler
too.

Grant suggested I should propose some patches; sure, if I'm not otherwise busy.

Maybe the Linaro guys can recommend a platform (real or emulated) that
would be best to test it on with the available UEFI?

>> Most of the guessing is ideally not required to be a guess at all, the
>> restrictions are purely to deal with the lack of trust for the
>> bootloader environment. Why can't we trust UEFI? Or at least hold it
>> to a higher standard. If someone ships a broken UEFI, they screw a
>> feature or have a horrible bug and ship it, laud the fact Linux
>> doesn't boot on it and the fact that it's their fault - over their
>> head. It actually works these days, Linux actually has "market share,"
>> companies really go out of their way to rescue their "image" and
>> resolve the situation when someone blogs about a serious UEFI bug on
>> their $1300 laptops, or even $300 tablets.
>
> Yeah, that hasn't actually worked out too well for us.

Aside from teething problems caused by a rush to market ;)

For the "ARM server market" rather than the "get the cheapest
tablet/ultrabook out of the door that runs Windows 8/RT" I am sure
this is going to get to be VERY important for vendors to take into
account. Imagine if Dell shipped a *server* where Linux would brick it
out of the box just for setting a variable.. however, if it works the
day they ship the server, and Linux gains better support for UEFI
booting which breaks the server in question, that's our fault for not
doing it in the right way in the first place, and Dell can be just as
angry at us as we would be at them. Vendors won't test code that
doesn't exist for obvious reasons.

This is what I was trying to get at about them not updating their
firmware support for the more firmware-aware method if the "ditch
firmware early" method worked well enough for them (which means
the functionality in the firmware never gets stressed and the
self-fulfilling prophecy of untrustworthy firmware vendors persists).
That firmware quality assurance - if not the code itself - will
trickle down to consumer tablets and ARM thin laptop kind of devices.

Ta,
Matt Sealey <neko@xxxxxxxxxxxxx>