Re: [RFCv2 0/9] UEFI emulator for kexec
From: Pingfan Liu
Date: Mon Sep 02 2024 - 01:40:58 EST
On Thu, Aug 29, 2024 at 1:08 AM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
>
[...]
Hi Ard,
Thanks for sharing your insight and thoughts.
>
> Thanks for putting this RFC together. This is useful work, and gives
> us food for thought and discussion.
>
> There are a few problems that become apparent when going through these changes.
>
> 1. Implementing UEFI entirely is intractable, and unnecessary.
> Implementing the subset of UEFI that is actually needed to boot Linux
> *is* tractable, though, but we need to work together to write this
> down somewhere.
> - the EFI stub needs the boot services for the EFI memory map and
> the allocation routines
> - GRUB needs block I/O
> - systemd-stub/UKI needs file I/O to look for sidecars
> - etc etc
>
> I implemented a Rust 'efiloader' crate a while ago that encapsulates
> most of this (it can boot Linux/arm64 on QEMU and boot x86 via GRUB in
> user space **). Adding file I/O to this should be straight-forward -
> as Lennart points out, we only need the protocol, it doesn't need to
> be backed by an actual file system, it just needs to be able to expose
> other files in the right way.
>
> 2. Running the UEFI emulator on bare metal is not going to scale.
> Cloning UART driver code and MMU code etc is a can of worms that you
> want to leave closed. And as Lennart points out, there is other
As for the MMU code: if the first kernel does not turn the MMU off,
that code can be dropped from the emulator. This should not be hard to
implement on arm64, and it is already done on x86.
> hardware (TPM) that needs to be accessible as well. Providing a
> separate set of drivers for all hardware that the EFI emulator may
> need to access is not a tractable problem either.
>
> The fix for this, as I see it, is to run the EFI emulator in user
> space, to the point where the payload calls ExitBootServices(). This
> will allow all I/O and memory protocol to be implemented trivially,
> using C library routines. I have a crude prototype** of this running
Yes, that is definitely a promising approach. This way, we can handle
device operations more elegantly. In fact, I used it to develop and
debug part of my emulator service code.
But when debugging the x86 EFI stub, I ran into a problem with a
privileged instruction, which causes a segmentation fault. It
originates from kaslr_get_random_long(). I think it can be worked
around by emulating the instruction if the instruction only reads
system state. But if the instruction tries to update system state, it
cannot be fixed, since the system is still owned by the kernel rather
than exclusively by the emulator.
So we also need an agreement on what the stub is allowed to do before
it calls ExitBootServices().
> to the point where ExitBootServices() is called (and then it crashes).
> The tricky yet interesting bit here is how we migrate a chunk of user
> space memory to the bare metal context that will be created by the
> kexec syscall later (in which the call to ExitBootServices() would
> return and proceed with the boot). But the principle is rather
> straight-forward, and would permit us, e.g., to kexec an OS installer
> too.
>
> 3. We need to figure out how to support TPM and PCRs in the context of
> kexec. This is a fundamental issue with verified boot, given that the
> kexec PCR state is necessarily different from the boot state, and so
> we cannot reuse the TPM directly if we want to pretend that we are
> doing an ordinary boot in kexec. The alternative is to leave the TPM
I am missing the big picture here. Could you elaborate on this?
As I understand it, the Linux kernel does not lock itself to a
specific firmware, so trust flows in one direction, i.e. from the
bootloader to the kernel. In the UKI case, systemd-stub takes the
measurements and extends PCRs 11/12/13 as described in
https://uapi-group.org/specifications/specs/linux_tpm_pcr_registry/
Later, systemd-pcrlock appraises the values in those registers. If the
sections in the UKI are intact, the kexec reboot will go smoothly.
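To make the extend step concrete, here is a minimal sketch of how a PCR
value accumulates measurements, assuming the SHA-256 bank and a PCR
that starts at all zeros. The names pcr_extend() and
measure_sections(), and the sample section contents, are hypothetical
and only for illustration; this does not reproduce exactly what
systemd-stub hashes.

```python
import hashlib

PCR_SIZE = 32  # size of one PCR in the SHA-256 bank (assumed bank)


def pcr_extend(pcr: bytes, data: bytes) -> bytes:
    """Model of a TPM extend: new = SHA256(old || SHA256(data))."""
    digest = hashlib.sha256(data).digest()
    return hashlib.sha256(pcr + digest).digest()


def measure_sections(sections, pcr: bytes = bytes(PCR_SIZE)) -> bytes:
    """Extend one PCR with each (name, payload) pair, in order.

    Only the payload is hashed here; that is a simplification for the
    sketch, not a description of any real stub.
    """
    for _name, payload in sections:
        pcr = pcr_extend(pcr, payload)
    return pcr


if __name__ == "__main__":
    # Hypothetical section contents, for illustration only.
    uki = [(".linux", b"kernel image"), (".initrd", b"initrd image")]

    first_boot = measure_sections(uki)
    # Same sections, but the PCR already holds earlier measurements.
    after_kexec = measure_sections(uki, pcr=first_boot)

    print("same sections, same start :", measure_sections(uki) == first_boot)
    print("same sections, other start:", after_kexec == first_boot)
```

In this model the final value depends on the starting PCR contents and
on the order of the extends, not only on the section contents.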
> in a state where the kexec kernel can access its sealed secrets, and
> mock up the TCG2 EFI protocols using a shim that sits between the TPM
> hardware (as the real TCG2 protocols will be long gone) and the EFI
> payload. But as I said, this is a fundamental issue, as the ability to
> pretend that a kexec boot is a pristine boot would mean that verified
> boot is broken.
>
>
> As future work, I'd like to propose to collaborate on some alignment
> regarding a UEFI baseline for Linux, i.e., the parts that we actually
> need to boot Linux.
>
Do you mean both the user space code and the kernel code? For the
user space code, I think it would be better to integrate it into
kexec-tools, so that we have a uniform interface for kexec boot.
I look forward to collaborating to make kexec able to boot a UKI soon.
Thanks,
Pingfan