Re: [PATCH 2/4] x86/virt/tdx: Pull kexec cache flush logic into arch/x86
From: Huang, Kai
Date: Sun Mar 08 2026 - 20:23:52 EST
On Fri, 2026-03-06 at 17:03 -0800, Rick Edgecombe wrote:
> KVM tries to take care of some required cache flushing earlier in the
> kexec path in order to be kind to some long standing races that can occur
> later in the operation. Until recently, VMXOFF was handled within KVM.
> Since VMX being enabled is required to make a SEAMCALL, it had the best
> per-cpu scoped operation to plug the flushing into.
>
> This early kexec cache flushing in KVM happens via a syscore shutdown
> callback. Now that VMX enablement control has moved to arch/x86, which has
> grown its own syscore shutdown callback, it no longer make sense for it to
> live in KVM. It fits better with the TDX enablement managing code.
[...]
>
> In addition, future changes will add a SEAMCALL that happens immediately
> before VMXOFF, which means the cache flush in KVM will be too late to be
> helpful. So move it to the newly added TDX arch/x86 syscore shutdown
> handler.
Nit: I am not sure how to interpret "too late to be helpful". I think we
can just get rid of this paragraph.
>
> Since tdx_cpu_flush_cache_for_kexec() is no longer needed by KVM, make it
> static and remove the export. Since it is also not part of an operation
> spread across disparate components, remove the redundant comments and
> verbose naming.
>
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@xxxxxxxxx>
Feel free to add:
Acked-by: Kai Huang <kai.huang@xxxxxxxxx>
Btw, there's a functional change here, and perhaps we should call it out in
the changelog:
- Currently tdx_cpu_flush_cache_for_kexec() is done in
  kvm_disable_virtualization_cpu(), which is also called by KVM's CPUHP
  offline() callback. So tdx_cpu_flush_cache_for_kexec() is explicitly done
  in TDX code on CPU offline.
- With this change, tdx_cpu_flush_cache_for_kexec() is no longer explicitly
  done in TDX code on CPU offline.
But AFAICT this is fine, since IIUC WBINVD is always done when the kernel
offlines a CPU (see [*]), i.e., the current tdx_cpu_flush_cache_for_kexec()
call in KVM's CPUHP callback is actually superfluous.
[*] See:
  native_play_dead() ->
      cpuidle_play_dead();
      hlt_play_dead();
cpuidle_play_dead() can invoke different enter_dead() callbacks depending on
which idle driver is being used, but AFAICT it eventually ends up calling
either acpi_idle_play_dead() or mwait_play_dead(), both of which do WBINVD
before going idle.
If cpuidle_play_dead() doesn't idle successfully, hlt_play_dead() will
then do WBINVD and HLT.
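To make the argument above concrete, here is a rough user-space sketch (not
kernel code) of the two paths as I understand them. The *_model() names and
wbinvd_stub() are made up for illustration and only record whether a flush
would have happened; the real functions have different signatures and do
much more.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative model only: records whether a WBINVD would have been issued. */
static bool cache_flushed;

/* Hypothetical stand-in for the real WBINVD instruction. */
static void wbinvd_stub(void)
{
	cache_flushed = true;
}

/*
 * Models cpuidle_play_dead(): returns true if the idle driver managed to
 * park the CPU. Per the reasoning above, acpi_idle_play_dead() and
 * mwait_play_dead() both flush before going idle.
 */
static bool cpuidle_play_dead_model(bool idle_driver_succeeds)
{
	if (!idle_driver_succeeds)
		return false;

	wbinvd_stub();
	return true;
}

/* Models hlt_play_dead(): flush, then (in the real kernel) HLT in a loop. */
static void hlt_play_dead_model(void)
{
	wbinvd_stub();
}

/* Models native_play_dead(): returns whether the cache ended up flushed. */
bool native_play_dead_model(bool idle_driver_succeeds)
{
	cache_flushed = false;

	if (!cpuidle_play_dead_model(idle_driver_succeeds))
		hlt_play_dead_model();

	return cache_flushed;
}

/* Either outcome of the idle driver should end with the cache flushed. */
void check_offline_always_flushes(void)
{
	assert(native_play_dead_model(true));
	assert(native_play_dead_model(false));
}
```

If this model matches the real code, neither path leaves the CPU offline
with dirty cachelines, which is why the explicit flush in KVM's CPUHP
callback looks redundant to me.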
Actually, after looking at multiple commits around here, e.g.,
ea53069231f93 ("x86, hotplug: Use mwait to offline a processor, fix the
legacy case")
dfbba2518aac4 ("Revert "ACPI: processor: idle: Only flush cache on
entering C3"")
... I believe it's kernel policy to make sure the cache is flushed when it
offlines a CPU (which makes sense anyway, of course). I just couldn't find
the exact commit saying this (and I am not sure whether such a commit
exists).
Btw2, kinda related to this, could you help review:
https://lore.kernel.org/lkml/20260302102226.7459-1-kai.huang@xxxxxxxxx/