Re: [PATCH 05/30] x86, kaiser: prepare assembly for entry/exit CR3 switching

From: Thomas Gleixner
Date: Mon Nov 20 2017 - 07:18:01 EST


On Fri, 10 Nov 2017, Dave Hansen wrote:
> From: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
>
> This is largely code from Andy Lutomirski. I fixed a few bugs
> in it, and added a few SWITCH_TO_* spots.
>
> KAISER needs to switch to a different CR3 value when it enters
> the kernel and switch back when it exits. This essentially
> needs to be done before leaving assembly code.
>
> This is extra challenging because the switching context is
> tricky: the registers that can be clobbered can vary. It is also
> hard to store things on the stack because there is an established
> ABI (ptregs) or the stack is entirely unsafe to use.

Changelog nitpicking starts here

> This patch establishes a set of macros that allow changing to

s/This patch establishes/Establish/

> the user and kernel CR3 values.
>
> Interactions with SWAPGS: previous versions of the KAISER code
> relied on having per-cpu scratch space to save/restore a register
> that can be used for the CR3 MOV. The %GS register is used to
> index into our per-cpu space, so SWAPGS *had* to be done before

s/our/the/

> the CR3 switch. That scratch space is gone now, but the semantic
> that SWAPGS must be done before the CR3 MOV is retained. This is
> good to keep because it is not that hard to do and it allows us

s/us//

> to do things like add per-cpu debugging information to help us
> figure out what goes wrong sometimes.

the part after 'information' is fairy tale mode and redundant. Debugging
information says it all, right?

> What this does in the NMI code is worth pointing out. NMIs
> can interrupt *any* context and they can also be nested with
> NMIs interrupting other NMIs. The comments below
> ".Lnmi_from_kernel" explain the format of the stack during this
> situation. Changing the format of this stack is not a fun
> exercise: I tried. Instead of storing the old CR3 value on the
> stack, this patch depend on the *regular* register save/restore
> mechanism and then uses %r14 to keep CR3 during the NMI. It is
> callee-saved and will not be clobbered by the C NMI handlers that
> get called.

The comments below ".Lnmi_from_kernel" explain the format of the stack
during this situation. Changing this stack format is too complex and
risky, so the following solution has been used:

Instead of storing the old CR3 value on the stack, depend on the regular
register save/restore mechanism and use %r14 to hold CR3 during the
NMI. r14 is callee-saved and will not be clobbered by the C NMI handlers
that get called.

End of nitpicking

> +.macro SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg:req save_reg:req
> + movq %cr3, %r\scratch_reg
> + movq %r\scratch_reg, \save_reg
> + /*
> + * Is the switch bit zero? This means the address is
> + * up in real KAISER patches in a moment.

* If the switch bit is zero, CR3 points at the kernel page tables
* already.
Hmm?

> /*
> @@ -1189,6 +1201,7 @@ ENTRY(paranoid_exit)
> testl %ebx, %ebx /* swapgs needed? */
> jnz .Lparanoid_exit_no_swapgs
> TRACE_IRQS_IRETQ
> + RESTORE_CR3 %r14

You have the named macro arguments everywhere, just not here.

Other than that.

Reviewed-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>