Re: [PATCH v2 2/5] x86/kexec: do unconditional WBINVD in relocate_kernel()

From: Tom Lendacky
Date: Wed Mar 20 2024 - 09:50:15 EST


On 3/19/24 16:20, Huang, Kai wrote:
> On 20/03/2024 3:38 am, Tom Lendacky wrote:
>> On 3/19/24 06:13, Kirill A. Shutemov wrote:
>>> On Tue, Mar 19, 2024 at 01:48:45AM +0000, Kai Huang wrote:
>>>> Both SME and TDX can leave caches in an incoherent state due to memory
>>>> encryption.  During kexec, the caches must be flushed before jumping to
>>>> the second kernel to avoid silent memory corruption in the second kernel.
>>>>
>>>> During kexec, the WBINVD in stop_this_cpu() flushes caches for all
>>>> remote cpus as they are being stopped.  For SME, the WBINVD in
>>>> relocate_kernel() flushes the cache for the last running cpu (the one
>>>> executing the kexec).
>>>>
>>>> Similarly, for TDX, after stopping all remote cpus with caches flushed,
>>>> the kernel needs to flush the cache for the last running cpu to support
>>>> kexec.
>>>>
>>>> Make the WBINVD in relocate_kernel() unconditional to cover both SME
>>>> and TDX.
>>>
>>> Nope.  It breaks TDX guests: WBINVD triggers a #VE for TDX guests.
>>
>> Ditto for SEV-ES/SEV-SNP: a #VC is generated and crashes the guest.
>
> Oh, I forgot these.
>
> Hi Kirill,
>
> Then I think patch 1 will also break TDX guests after your series to
> enable multiple cpus for the second kernel after kexec()?
>
> Hi Tom,
>
> I am not aware of the kexec() support status for SEV-ES/SEV-SNP guests.
> Does patch 1 break them?

SNP guests can kexec with some patches that are currently in flight around shared-to-private memory conversions. ES guests can only kexec with a single vCPU. There was a recent patch series to add support for multiple vCPUs.

Patch #1 doesn't break either ES or SNP because we still have an IDT and traditional kernel addressing in place, so the #VC can be handled.

Whereas by the time the WBINVD in patch #2 runs, the kernel has switched to identity mapping and removed the IDT, so a #VC causes a triple fault.

Thanks,
Tom