Re: [PATCH] x86/kexec: Only write through identity mapping of control page

From: Dave Hansen
Date: Thu Dec 12 2024 - 16:44:14 EST


On 12/12/24 13:32, David Woodhouse wrote:
> On 12 December 2024 21:18:10 GMT, Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
>> On 12/12/24 12:11, David Woodhouse wrote:
>>> From: David Woodhouse <dwmw@xxxxxxxxxxxx>
>>>
>>> The virtual mapping of the control page may have been _PAGE_GLOBAL and
>>> thus its PTE might not have been flushed on the %cr3 switch and
>>> it might effectively still be read-only. Move the writes to it
>>> down into the identity_mapped() function where the same
>>> %rip-relative addressing will get the new mapping.
>>>
>>> The stack is fine, as that's using the identity mapped address
>>> anyway.
>>
>> Shouldn't we also ensure that Global entries don't bite anyone
>> else? Something like the completely untested attached patch?
>
> Doesn't hurt, but this is an identity mapping so absolutely
> everything other than this one page is going to be in the low
> (positive) part of the canonical address space, so won't have had
> global pages in the first place will they?

Right, it's generally _not_ a problem. But it _can_ be a surprising
problem which is why we're all looking at it today. ;)

> Probably a kind thing to do for whatever we're passing control to
> though :)
>
> I'll round it up into the tree and send it out with the next batch of
> debug support. Care to give me a SoB for it? You can
> s/CR0_PGE/CR4_PGE/ too if you like but I can do that myself as well.

Here's a fixed one with a changelog and a SoB. Still 100% gloriously
untested though.

From 3513c089e4d281fa932d2b3245443645c1c44c53 Mon Sep 17 00:00:00 2001
From: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
Date: Thu, 12 Dec 2024 13:35:14 -0800
Subject: [PATCH] x86/mm: Ensure Global mappings are zapped during kexec

The kernel switches to a new set of page tables during kexec. The
global mappings (_PAGE_GLOBAL==1) can remain in the TLB after this
switch. This is generally not a problem because the new page tables
use a different portion of the virtual address space than the normal
kernel mappings.

But there's no good reason to leave the old TLB entries around. They
can cause nothing but trouble. Clear "Page Global Enable"
(X86_CR4_PGE). This, along with the CR3 write, ensures that there is
no trace of the old page tables in the TLB, even global entries.

Signed-off-by: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
---
arch/x86/kernel/relocate_kernel_64.S | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
index e9e88c342f752..87fc788fa67b2 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -155,6 +155,8 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
*/
andl $(X86_CR4_PAE | X86_CR4_LA57), %r13d
ALTERNATIVE "", __stringify(orl $X86_CR4_MCE, %r13d), X86_FEATURE_TDX_GUEST
+ /* Invalidate Global entries from the TLB: */
+ andl $~(X86_CR4_PGE), %r13d
movq %r13, %cr4

/* Flush the TLB (needed?) */
--
2.34.1
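
For anyone following the bit arithmetic, here is a minimal user-space C
sketch of what the hunk above does to the saved CR4 value. It is only a
model: the bit positions are the architectural ones (PAE bit 5, MCE bit 6,
PGE bit 7, LA57 bit 12), while the function name kexec_cr4() and the
tdx_guest flag are made up for illustration and are not kernel symbols.
Nothing here touches a real control register.

```c
#include <stdint.h>

/* Architectural CR4 bit positions. */
#define X86_CR4_PAE  (1u << 5)   /* Physical Address Extension */
#define X86_CR4_MCE  (1u << 6)   /* Machine Check Enable */
#define X86_CR4_PGE  (1u << 7)   /* Page Global Enable */
#define X86_CR4_LA57 (1u << 12)  /* 5-level paging */

/*
 * Hypothetical helper modelling the sequence in identity_mapped():
 * start from the original CR4 value, keep only PAE and LA57, set MCE
 * when running as a TDX guest, then clear PGE explicitly.
 */
static uint32_t kexec_cr4(uint32_t orig_cr4, int tdx_guest)
{
	/* andl $(X86_CR4_PAE | X86_CR4_LA57), %r13d */
	uint32_t cr4 = orig_cr4 & (X86_CR4_PAE | X86_CR4_LA57);

	/* ALTERNATIVE ... orl $X86_CR4_MCE, %r13d (TDX guests only) */
	if (tdx_guest)
		cr4 |= X86_CR4_MCE;

	/* andl $~(X86_CR4_PGE), %r13d */
	cr4 &= ~X86_CR4_PGE;

	return cr4;
}
```

For example, kexec_cr4(X86_CR4_PAE | X86_CR4_PGE | X86_CR4_LA57, 0)
yields X86_CR4_PAE | X86_CR4_LA57, so PGE is never set in the value
written back. In this model the first mask already drops PGE, so the
explicit clear mostly documents intent; whether that holds for the real
code path should be checked against the tree.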