Re: [PATCH v3 2/5] x86/kexec: do unconditional WBINVD for bare-metal in relocate_kernel()

From: Kirill A. Shutemov
Date: Wed Apr 10 2024 - 10:15:26 EST


On Mon, Apr 08, 2024 at 12:44:55AM +1200, Kai Huang wrote:
> Both SME and TDX can leave caches in an incoherent state due to memory
> encryption. During kexec, the caches must be flushed before jumping to
> the second kernel to avoid silently corrupting its memory.
>
> During kexec, the WBINVD in stop_this_cpu() flushes the caches of all
> remote CPUs as they are being stopped. For SME, the WBINVD in
> relocate_kernel() flushes the cache of the last running CPU (the one
> executing the kexec).
>
> Similarly, to support kexec on a TDX host, after all remote CPUs have
> been stopped with their caches flushed, the kernel needs to flush the
> cache of the last running CPU.
>
> Use the existing WBINVD in relocate_kernel() to cover the TDX host as well.
>
> However, instead of sprinkling vendor-specific checks around, just do
> an unconditional WBINVD to cover both SME and TDX. Kexec is not a fast
> path, so one additional WBINVD on platforms without SME/TDX is
> acceptable.
>
> But only do the WBINVD on bare metal, because TDX guests and
> SEV-ES/SEV-SNP guests would get an unexpected (and unnecessary)
> exception (#VE or #VC) which the kernel is unable to handle at this
> stage.
>
> Signed-off-by: Kai Huang <kai.huang@xxxxxxxxx>
> Cc: Tom Lendacky <thomas.lendacky@xxxxxxx>
> Cc: Dave Young <dyoung@xxxxxxxxxx>
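
For anyone skimming the thread, the policy change described above can be
sketched in plain C. This is only an illustration of the decision logic as
the commit message states it, not the kernel code: struct platform and both
function names are hypothetical, and the real check lives in the kexec path
around relocate_kernel().

```c
#include <stdbool.h>

/* Hypothetical stand-in for platform state; not a kernel structure. */
struct platform {
	bool sme_active;	/* host memory encryption (SME) is enabled */
	bool is_guest;		/* running under a hypervisor (TDX/SEV guest, etc.) */
};

/*
 * Behaviour before the patch, as described in the commit message:
 * the WBINVD on the last running CPU is only there for SME.
 */
bool flush_before_patch(const struct platform *p)
{
	return p->sme_active;
}

/*
 * Behaviour after the patch, as described in the commit message:
 * flush unconditionally on bare metal, and never in a guest, where
 * WBINVD would raise an exception the kernel cannot handle this late.
 */
bool flush_after_patch(const struct platform *p)
{
	return !p->is_guest;
}
```

With this model, a bare-metal TDX host (sme_active = false, is_guest = false)
is not flushed under the old rule but is flushed under the new one, while any
guest is skipped under the new rule.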

Reviewed-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>

--
Kiryl Shutsemau / Kirill A. Shutemov