Re: [PATCH qemu] x86: don't let decompressed kernel image clobber setup_data

From: Jason A. Donenfeld
Date: Wed Dec 28 2022 - 11:33:46 EST


On Wed, Dec 28, 2022 at 05:02:22PM +0100, Philippe Mathieu-Daudé wrote:
> Hi Jason,
>
> On 28/12/22 15:38, Jason A. Donenfeld wrote:
> > The setup_data links are appended to the compressed kernel image. Since
> > the kernel image is typically loaded at 0x100000, setup_data lives at
> > `0x100000 + compressed_size`, which does not get relocated during the
> > kernel's boot process.
> >
> > The kernel typically decompresses the image starting at address
> > 0x1000000 (note: there's one more zero there than the decompressed image
*compressed image

> > + uint32_t target_address = ldl_p(setup + 0x258);
>
> Nitpicking, can the Linux kernel add these magic values in
> arch/x86/include/uapi/asm/bootparam.h? Or can we use
> offsetof(setup_header) to get them?

I suspect the reason that x86.c has always had those hardcoded offsets
is because this is how it's documented in Documentation/x86/boot.rst?

> > + if ((start_setup_data >= start_target && start_setup_data < end_target) ||
> > + (end_setup_data >= start_target && end_setup_data < end_target)) {
> > + uint32_t padded_size = target_address + decompressed_length - prot_addr;
> > +
> > + /* The early stage can't address past around 64 MB from the original
> > + * mapping, so just give up in that case. */
> > + if (padded_size < 62 * 1024 * 1024)
>
> You mention 64 but check for 62, is that expected? You can use the MiB
> definitions to ease code review: 64 * MiB.

62 is intentional. But I'm still not really sure what's up. 63 doesn't
work. I haven't totally worked out why this is, or why the 64 MiB limit
exists in the first place. Maybe because this is a very early mapping
set up by real mode? Or because another mapping is placed over it that's
executable? There's that 2MiB*4096 gdt entry, but that'd cover all 4
gigs. So I really don't know yet. I'll continue to poke at it, but on
the off chance somebody here understands what's up, that'd save me a
bunch of head scratching.

> Fix looks good, glad you figured out the problem.

I mean, kind of. The solution here sucks, especially given that in the
worst case, setup_data just gets dropped. I'm half inclined to consider
this a kernel bug instead, and add some code to relocate setup_data
prior to decompression, and then fix up all the links. It seems like
this would be a lot more robust.

I just wish the people who wrote this stuff would chime in. I've had
x86@xxxxxxxxxx CC'd but so far, no input from them.

Jason