Re: [PATCH qemu] x86: don't let decompressed kernel image clobber setup_data
From: Ard Biesheuvel
Date: Mon Jan 02 2023 - 04:32:25 EST
On Mon, 2 Jan 2023 at 07:17, Borislav Petkov <bp@xxxxxxxxx> wrote:
>
> On Mon, Jan 02, 2023 at 07:01:50AM +0100, Borislav Petkov wrote:
> > On Sat, Dec 31, 2022 at 07:31:21PM -0800, H. Peter Anvin wrote:
> > > It would probably be a good idea to add a "maximum physical address for
> > > initrd/setup_data/cmdline" field to struct kernel_info, though. It appears
> > > right now that those fields are being identity-mapped in the decompressor,
> > > and that means that if 48-bit addressing is used, physical memory may extend
> > > past the addressable range.
> >
> > Yeah, we will probably need that too.
> >
> > Btw, looka here - it can't get any more obvious than that after dumping
> > setup_data too:
> >
> > early console in setup code
> > early console in extract_kernel
> > input_data: 0x00000000040f92bf
> > input_len: 0x0000000000f1c325
> > output: 0x0000000001000000
> > output_len: 0x0000000003c5e7d8
> > kernel_total_size: 0x0000000004428000
> > needed_size: 0x0000000004600000
> > boot_params->hdr.setup_data: 0x00000000010203b0
> > trampoline_32bit: 0x000000000009d000
> >
> > Decompressing Linux... Parsing ELF... done.
> > Booting the kernel.
> > <EOF>
> >
> > Aligning them vertically:
> >
> > output: 0x0000000001000000
> > output_len: 0x0000000003c5e7d8
> > kernel_total_size: 0x0000000004428000
> > needed_size: 0x0000000004600000
> > boot_params->hdr.setup_data: 0x00000000010203b0
>
> Ok, waait a minute:
>
> ============ ============
> Field name: pref_address
> Type: read (reloc)
> Offset/size: 0x258/8
> Protocol: 2.10+
> ============ ============
>
> This field, if nonzero, represents a preferred load address for the
> kernel. A relocating bootloader should attempt to load at this
> address if possible.
>
> A non-relocatable kernel will unconditionally move itself and to run
> at this address.
>
> so a kernel loader (qemu in this case) already knows where the kernel goes:
>
> boot_params->hdr.setup_data: 0x0000000001020450
> boot_params->hdr.pref_address: 0x0000000001000000
> ^^^^^^^^^^^^^^^^^
>
> now, considering that same kernel loader (qemu) knows how big that kernel is:
>
> kernel_total_size: 0x0000000004428000
>
> should that loader *not* put anything that the kernel will use in the range
>
> pref_addr + kernel_total_size
>
This seems to be related to another issue that was discussed in the
context of this change, but affecting EFI boot not legacy BIOS boot
[0].
So, in a nutshell, we have the following pieces:
- QEMU, which manages a directory of files and other data blobs, and
exposes them via its fw_cfg interface.
- SeaBIOS, which invokes the fw_cfg interface to load the 'kernel'
blob at its preferred address
- The boot code in the kernel, which interprets the various fields in
the setup header to figure out where the compressed image lives etc
So the problem here, which applies to SETUP_DTB as well as
SETUP_RNG_SEED, is that the internal file representation of the kernel
blob (which does not have an absolute address at this point, it's just
a file in the fw_cfg filesystem) is augmented with:
1) setup_data linked-list entries carrying absolute addresses that are
assumed to be valid once SeaBIOS loads the file to memory
2) DTB and/or RNG seed blobs appended to the compressed 'kernel' blob,
but without updating that file's internal metadata
Issue 1) is what broke EFI boot, given that EFI interprets the kernel
blob as a PE/COFF image and hands it to the Loadimage() boot service,
which has no awareness of boot_params or setup_data and so just
ignores it and loads the image at an arbitrary address, resulting in
setup_data absolute address values pointing to bogus places.
It seems that now, we have another issue 2), where the fw_cfg view of
the file size goes out of sync with the compressed image's own view of
its size.
As a fix for issue 1), we explored another solution, which was to
allocate fixed areas in memory for the RNG seed, so that the absolute
address added to setup_data is guaranteed to be correct regardless of
where the compressed image is loaded, but that was shot down for other
reasons, and we ended up enabling this feature only for legacy BIOS
boot. But apparently, this approach has other issues so perhaps it is
better to revisit that solution again.
So instead of appending data to the compressed image and assuming that
it will stay in place, create or extend a memory reservation
elsewhere, and refer to its absolute address in setup_data.
--
Ard.
[0] https://lore.kernel.org/all/CAMj1kXFr6Bv4_G0-wCTu4fp_iCrG060NHJx_j2dbnyiFJKYYeQ@xxxxxxxxxxxxxx/