Re: [PATCH v3 1/5] x86/kexec: do unconditional WBINVD for bare-metal in stop_this_cpu()

From: Kirill A. Shutemov
Date: Thu Apr 11 2024 - 09:32:02 EST


On Thu, Apr 11, 2024 at 09:54:13AM +1200, Huang, Kai wrote:
>
>
> On 11/04/2024 2:12 am, Kirill A. Shutemov wrote:
> > On Mon, Apr 08, 2024 at 12:44:54AM +1200, Kai Huang wrote:
> > > TL;DR:
> >
> > The commit message is waaay too verbose for no good reason. You don't
> > really need to repeat all the history around this code.
>
> Could you be more specific?
>
> I was following Boris's suggestion to summarize all the discussion around
> the "unconditional WBINVD" issue.
>
> https://lore.kernel.org/linux-kernel/20240228110207.GCZd8Sr8mXHA2KTiLz@fat_crate.local/
>
> I can try to improve if I can know specifically what should be trimmed down.

What about something like this:

x86/mm: Do unconditional WBINVD in stop_this_cpu() for bare metal

Both AMD SME and Intel TDX can leave caches in an incoherent state due to
memory encryption, which can lead to silent memory corruption during kexec. To
address this issue, it is necessary to flush the caches before jumping to the
second kernel.

Previously, the kernel only performed WBINVD in stop_this_cpu() when SME
support was detected. To support TDX as well, perform WBINVD unconditionally
instead of adding vendor-specific checks. Kexec is a slow path, and the
additional WBINVD is acceptable for the sake of simplicity and
maintainability.

It is important to note that WBINVD should only be done for bare-metal
scenarios, as TDX guests and SEV-ES/SEV-SNP guests may not handle unexpected
exceptions (#VE or #VC) caused by WBINVD.

Historically, there were issues with unconditional WBINVD, leading to system
hangs or resets on different Intel systems. These issues were addressed by a
series of commits, culminating in the fix provided by commit 1f5e7eb7868e
("x86/smp: Make stop_other_cpus() more robust").

Further testing on problematic machines confirmed that the issues could not be
reproduced after applying the fix. Therefore, it is now safe to unconditionally
perform WBINVD in stop_this_cpu().

You can also add links to relevant threads as Link: tags.
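
To make the intended behaviour concrete, here is a rough user-space sketch of
the logic the message describes: flush on bare metal, skip the flush in a
guest. It is not the actual patch. struct mock_cpu, mock_wbinvd() and
mock_stop_this_cpu() are hypothetical stand-ins I made up for illustration,
and the boolean only models whatever hypervisor check the real code ends up
using.

```c
#include <stdbool.h>

/*
 * Hypothetical model of per-CPU state. None of these names are kernel APIs;
 * they exist only to illustrate the control flow.
 */
struct mock_cpu {
	bool running_under_hypervisor;	/* models a "not bare-metal" check */
	unsigned int wbinvd_count;	/* number of times the mock flush ran */
};

/* Stand-in for the cache flush; the real instruction is privileged. */
static void mock_wbinvd(struct mock_cpu *cpu)
{
	cpu->wbinvd_count++;
}

/*
 * Sketch of the proposed behaviour: flush caches unconditionally on bare
 * metal, with no vendor-specific (SME/TDX) checks, and skip the flush in
 * guests, where WBINVD may raise an exception the guest cannot handle.
 */
static void mock_stop_this_cpu(struct mock_cpu *cpu)
{
	if (!cpu->running_under_hypervisor)
		mock_wbinvd(cpu);

	/* The real function would halt the CPU here. */
}
```

Calling mock_stop_this_cpu() on a CPU with running_under_hypervisor == false
bumps wbinvd_count once; with it set to true the counter is left alone.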

--
Kiryl Shutsemau / Kirill A. Shutemov