Re: [PATCH 2/4] x86/virt/tdx: Pull kexec cache flush logic into arch/x86

From: Huang, Kai

Date: Sun Mar 08 2026 - 20:23:52 EST


On Fri, 2026-03-06 at 17:03 -0800, Rick Edgecombe wrote:
> KVM tries to take care of some required cache flushing earlier in the
> kexec path in order to be kind to some long standing races that can occur
> later in the operation. Until recently, VMXOFF was handled within KVM.
> Since VMX being enabled is required to make a SEAMCALL, it had the best
> per-cpu scoped operation to plug the flushing into.
>
> This early kexec cache flushing in KVM happens via a syscore shutdown
> callback. Now that VMX enablement control has moved to arch/x86, which has
> grown its own syscore shutdown callback, it no longer make sense for it to
> live in KVM. It fits better with the TDX enablement managing code.

[...]

>
> In addition, future changes will add a SEAMCALL that happens immediately
> before VMXOFF, which means the cache flush in KVM will be too late to be
> helpful. So move it to the newly added TDX arch/x86 syscore shutdown
> handler.

Nit: I am not sure how to interpret "too late to be helpful". I think we
can just get rid of this paragraph.

>
> Since tdx_cpu_flush_cache_for_kexec() is no longer needed by KVM, make it
> static and remove the export. Since it is also not part of an operation
> spread across disparate components, remove the redundant comments and
> verbose naming.
>
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@xxxxxxxxx>

Feel free to add:

Acked-by: Kai Huang <kai.huang@xxxxxxxxx>

Btw, there's a functional change here, and perhaps we should call it out in
the changelog:

- Currently tdx_cpu_flush_cache_for_kexec() is done in
  kvm_disable_virtualization_cpu(), which is also called by KVM's CPUHP
  offline() callback. So tdx_cpu_flush_cache_for_kexec() is explicitly done
  in TDX code during CPU offline.

- With this change, tdx_cpu_flush_cache_for_kexec() is no longer explicitly
  done in TDX code during CPU offline.

But AFAICT this is fine, since IIUC WBINVD is always done when the kernel
offlines a CPU (see [*]), i.e., the current tdx_cpu_flush_cache_for_kexec()
done in KVM's CPUHP offline() callback is actually superfluous.

[*] See:

native_play_dead() ->
    cpuidle_play_dead();
    hlt_play_dead();

cpuidle_play_dead() can invoke different enter_dead() callbacks depending on
which idle driver is being used, but AFAICT it eventually ends up calling
either acpi_idle_play_dead() or mwait_play_dead(), both of which do WBINVD
before going idle.

If cpuidle_play_dead() doesn't idle successfully, hlt_play_dead() will
then do WBINVD and HLT.
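
To make the ordering above concrete, here is a rough user-space toy model in C. It is only a sketch of the control flow as I described it, not kernel code: every model_* name, the cpu_model struct and the idle_driver_ok flag are made up for illustration, and the "WBINVD" is just a boolean being set.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy user-space model of the offline path; NOT kernel code. */
struct cpu_model {
	bool cache_flushed;	/* set when the model "executes" WBINVD */
	bool halted;		/* set when the model parks the CPU */
};

/* Stand-in for WBINVD: only records that a flush happened. */
static void model_wbinvd(struct cpu_model *cpu)
{
	cpu->cache_flushed = true;
}

/*
 * Models cpuidle_play_dead(): returns 0 if an idle driver callback
 * (assumed to behave like acpi_idle_play_dead()/mwait_play_dead())
 * parked the CPU, non-zero otherwise. idle_driver_ok is a hypothetical
 * knob to select between the two outcomes.
 */
static int model_cpuidle_play_dead(struct cpu_model *cpu, bool idle_driver_ok)
{
	if (!idle_driver_ok)
		return -1;	/* hypothetical failure value */

	model_wbinvd(cpu);	/* flush before going idle */
	cpu->halted = true;
	return 0;
}

/* Models hlt_play_dead(): flush, then halt. */
static void model_hlt_play_dead(struct cpu_model *cpu)
{
	model_wbinvd(cpu);
	cpu->halted = true;
}

/* Models native_play_dead(): try cpuidle first, fall back to HLT. */
static void model_native_play_dead(struct cpu_model *cpu, bool idle_driver_ok)
{
	if (model_cpuidle_play_dead(cpu, idle_driver_ok))
		model_hlt_play_dead(cpu);

	assert(cpu->halted);	/* either path must have parked the CPU */
}

/* Helper: does offlining end with the cache flushed in this model? */
static bool model_offline_flushes_cache(bool idle_driver_ok)
{
	struct cpu_model cpu = { false, false };

	model_native_play_dead(&cpu, idle_driver_ok);
	return cpu.cache_flushed;
}
```

In this model, model_offline_flushes_cache() returns true both when the idle driver path succeeds and when it fails, which is the property I am relying on above. Whether the real enter_dead() callbacks all behave this way is exactly the part I could only confirm by reading the code.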

Actually, after looking at multiple commits around here, e.g.,

ea53069231f93 ("x86, hotplug: Use mwait to offline a processor, fix the
legacy case")
dfbba2518aac4 ("Revert "ACPI: processor: idle: Only flush cache on
entering C3"")

... I believe it's kernel policy to make sure the cache is flushed when it
offlines a CPU (which makes sense anyway, of course). I just couldn't find
the exact commit saying this (and I am not sure whether such a commit exists).


Btw2, kinda related to this, could you help review:

https://lore.kernel.org/lkml/20260302102226.7459-1-kai.huang@xxxxxxxxx/