Re: [tip:x86/boot] x86/boot/compressed/64: Prepare trampoline memory
From: Ingo Molnar
Date: Mon Feb 26 2018 - 03:47:17 EST
* Kirill A. Shutemov <kirill@xxxxxxxxxxxxx> wrote:
> > Also, could do a puts() hexdump of the affected memory area _before_ we overwrite
> > it? Is it empty? Could we add some debug warning that checks that it's all zeroes?
>
> The problem is that we don't really have a way get a message out of there.
Is there any memory area we can write to that survives into the real kernel?
> But without a way to print address it's still a black box.
> > 1)
> >
> > /* Check if LA57 is desired and supported */
> > if (IS_ENABLED(CONFIG_X86_5LEVEL) && native_cpuid_eax(0) >= 7 &&
> > (native_cpuid_ecx(7) & (1 << (X86_FEATURE_LA57 & 31))))
> > paging_config.l5_required = 1;
> >
> > ... it isn't explained why this feature CPU check is so complex.
>
> We check that the CPUID leaf is supported and than check the feature
> itself.
>
> Maybe the first check is redundant, but I tried to be safe here.
So the explanation that is missing is to explain which usual primitive this is
equivalent to. This is essentially an early boot variant of
boot_cpu_has(X86_FEATURE_LA57), right?
> > 2)
> >
> > + /* Place the trampoline just below the end of low memory, aligned to 4k */
> > + paging_config.trampoline_start = bios_start - TRAMPOLINE_32BIT_SIZE;
> > + paging_config.trampoline_start = round_down(paging_config.trampoline_start, PAGE_SIZE);
> >
> > placing trampolines just below or just above BIOS images is fragile. Instead a
> > better heuristic is to use the "middle" of suspected available RAM and work from
> > there.
>
> It's not obvious what is lower end of available memory here. Any hints?
>
> Realtime trampoline is allocated with top-down approach and I tried to
> mimic the approach here.
Yeah, it's all a bit weird - I'm grasping at straws really - and the large size of
the patch doesn't help.
> > + /* Clear trampoline memory first */
> > + memset(trampoline, 0, TRAMPOLINE_32BIT_SIZE);
> >
> > Memory bootup state is typically all zeroes (except maybe for kexec), so this
> > should check that what it's clearing doesn't contain any data.
>
> Hm. I don't see why would we expect this. Do we really have guarantee that
> bootloader would not mess with the memory?
Hm, probably not - what's the typical memory usage of GRUB for example?
> > It should probably also clear this memory _after_ use.
>
> After use I tired to restore the original content of the memory.
> See cleanup_trampoline(). That looks safer to me.
Yeah, agreed.
BTW., it's still a possibility that we are barking up the wrong tree here: if the
bug Boris triggered is somehow not fully deterministic (but say kernel build
alignment dependent - which alignment your commit changed) then this might be the
wrong commit. A split-up into several patches would help there too: for example if
the new data structures in the kernel image trigger the bug (with nothing else in
that commit) then it strongly suggests that it's alignment related, etc.
But it certainly 'feels' to have a chance to be related: stomping on the wrong
piece of memory can make graphics fail.
> > Why '+ _PAGE_TABLE_NOENC', while not ' |' ?
>
> It shouldn't really matter, but yeah, '|' is more appropriate.
Readability and using standard patterns is king.
Thanks,
Ingo