Re: [PATCHv4 4/4] x86/boot/compressed/64: Handle 5-level paging boot if kernel is above 4G
From: Ingo Molnar
Date: Thu Dec 07 2017 - 02:03:38 EST
* Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx> wrote:
> This patch addresses a shortcoming in current boot process on machines
> that supports 5-level paging.
s/in current boot process
/in the current boot process
> If a bootloader enables 64-bit mode with 4-level paging, we might need to
> switch over to 5-level paging. The switching requires disabling paging.
> It works fine if kernel itself is loaded below 4G.
s/The switching requires disabling paging.
/The switching requires the disabling of paging.
> But if the bootloader put the kernel above 4G (not sure if anybody does
> this), we would loose control as soon as paging is disabled as code
> becomes unreachable.
Same suggestion for this paragraph as for the previous patch.
> This patch implements a trampoline in lower memory to handle this
> situation.
>
> We only need the memory for very short time, until main kernel image
> would setup its own page tables.
s/We only need the memory for very short time
/We only need this memory for a very short time
s/until main kernel image would setup its own page tables.
/until the main kernel image sets up its own page tables.
> We go through trampoline even if we don't have to: if we're already in
> 5-level paging mode or if we don't need to switch to it. This way the
> trampoline gets tested on every boot.
s/We go through trampoline
/We go through the trampoline
> /*
> * At this point we are in long mode with 4-level paging enabled,
> + * but we might want to enable 5-level paging.
> *
> + * The problem is that we cannot do it directly. Setting CR.LA57
> + * in the long mode would trigger #GP. So we need to switch off
> + * long mode and paging first.
s/Setting CR.LA57 in the long mode
/Setting CR.LA57 in long mode
Also, why only say 'CR', why not 'CR4'?
> + *
> + * We also need a trampoline in lower memory to switch over from
> + * 4- to 5-level paging for cases when bootloader put kernel above
> + * 4G, but didn't enable 5-level paging for us.
s/for cases when bootloader put kernel above 4G
/for cases when the bootloader puts the kernel above 4G
> + *
> + * For the trampoline, we need top page table in lower memory as
> + * we don't have a way to load 64-bit value into CR3 from 32-bit
> + * mode.
s/we need top page table in lower memory
/we need the top page table to reside in lower memory
s/load 64-bit value into CR3 from 32-bit mode
/load 64-bit values into CR3 in 32-bit mode
> + *
> + * We go though the trampoline even if we don't have to: if we're
> + * already in 5-level paging mode or if we don't need to switch to
> + * it. This way the trampoline code gets tested on every boot.
> /*
> + * Load address of trampoline_return into RDI.
> + * It will be used by trampoline to return to main code.
s/Load address of trampoline_return
/Load the address of trampoline_return
s/It will be used by trampoline to return to main code
/It will be used by the trampoline to return to the main code
> +trampoline_return:
> + /* Restore stack, 32-bit trampoline uses own stack */
> + leaq boot_stack_end(%rbx), %rsp
This phrasing would be a bit clearer:
/* Restore the stack, the 32-bit trampoline uses its own stack */
> +/*
> + * This is 32-bit trampoline that will be copied over to low memory.
s/This is 32-bit trampoline that will be
/This is the 32-bit trampoline that will be
> + *
> + * RDI contains return address (might be above 4G).
s/RDI contains return address (might be above 4G)
/RDI contains the return address (might be above 4G)
> + * ECX contains the base address of trampoline memory.
s/of trampoline memory
/of the trampoline memory
> + /* For 5-level paging, point CR3 to trampoline's new top level page table */
s/point CR3 to trampoline's new top level page table
/point CR3 to the trampoline's new top level page table
> + testl $1, %ecx
> + jz 1f
> + leal TRAMPOLINE_32BIT_PGTABLE_OFF (%edx), %eax
BTW., could you please spell out 'OFFSET'? 'OFF' reminds me of the ON/OFF pattern.
The constant is hopelessly long anyway ;-)
Also:
s/TRAMPOLINE_32BIT_PGTABLE_OFF (%edx)
/TRAMPOLINE_32BIT_PGTABLE_OFF(%edx)
(This applies to the rest of the patch as well.)
> - /* Enable PAE and LA57 mode */
> + /* Enable PAE and LA57 (if required) modes */
A bit more clarity:
s/modes
/paging modes
> + /* Calculate address of paging_enabled once we are in trampoline */
Please use the same function name reference as I suggested for the previous patch.
s/once we are in trampoline
/once we are executing in the trampoline
> /* Prepare stack for far return to Long Mode */
s/Prepare stack
/Prepare the stack
> /* Enable paging back */
s/Enable paging back
/Enable paging again
> + .code64
> +paging_enabled:
> + /* Return from the trampoline */
> + jmp *%rdi
> +
> + /*
> + * Bound size of trampoline code.
> + * It would fail to compile if code of the trampoline would grow
> + * beyond TRAMPOLINE_32BIT_CODE_SIZE bytes.
How about:
* The trampoline code has a size limit.
* Make sure we fail to compile if the trampoline code grows
* beyond TRAMPOLINE_32BIT_CODE_SIZE bytes.
?
> + .code32
> no_longmode:
> /* This isn't an x86-64 CPU so hang */
While at it:
s/This isn't an x86-64 CPU so hang
/This isn't an x86-64 CPU, so hang intentionally, we cannot continue:
Thanks,
Ingo