Re: [PATCH 4/7] x86/kexec: Disable kexec/kdump on platforms with TDX partial write erratum
From: Vishal Annapurve
Date: Sun Oct 26 2025 - 19:34:03 EST
On Mon, Sep 1, 2025 at 9:11 AM Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>
> From: Kai Huang <kai.huang@xxxxxxxxx>
>
> Some early TDX-capable platforms have an erratum: a partial write by
> the kernel (a write transaction smaller than a cacheline that lands at
> the memory controller) to TDX private memory poisons that memory, and a
> subsequent read triggers a machine check.
>
> On those platforms, the old kernel must reset TDX private memory before
> jumping to the new kernel, otherwise the new kernel may see an unexpected
> machine check. Currently the kernel doesn't track which page is a TDX
> private page. For simplicity just fail kexec/kdump for those platforms.
>
> Leverage the existing machine_kexec_prepare() to fail kexec/kdump by
> adding the check of the presence of the TDX erratum (which is only
> checked for if the kernel is built with TDX host support). This rejects
> kexec/kdump when the kernel is loading the kexec/kdump kernel image.
>
> The alternative is to reject kexec/kdump when the kernel is jumping to
> the new kernel. But for kexec this requires adding a new check (e.g.,
> arch_kexec_allowed()) in the common code to fail kernel_kexec() at an
> early stage. Kdump (crash_kexec()) needs a similar check, but it's hard
> to justify because crash_kexec() is not supposed to abort.
>
> It's feasible to further relax this limitation, i.e., only fail kexec
> when TDX is actually enabled by the kernel. But this is still a half
> measure compared to resetting TDX private memory so just do the simplest
> thing for now.

Hi Kai,

IIUC, the kernel doesn't donate any of its available memory to the TDX
module if TDX is not actually enabled (i.e. if the "kvm_intel.tdx=y"
kernel command line parameter is missing).

Why is it unsafe to allow kexec/kdump if "kvm_intel.tdx=y" is not
supplied to the kernel?

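To make the question concrete, here is a rough user-space sketch (not
kernel code) contrasting the check as I understand it from the changelog
with the relaxed variant I'm asking about. struct platform_state, both
helper names and the tdx_enabled flag are hypothetical stand-ins made up
for illustration, and -EOPNOTSUPP is only assumed from the "Operation
not supported" message quoted below.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the relevant platform state; not a kernel type. */
struct platform_state {
	bool has_partial_write_erratum;	/* platform has the TDX erratum */
	bool tdx_enabled;		/* TDX was actually enabled by the kernel */
};

/* Changelog behaviour: refuse to load whenever the erratum is present. */
int kexec_prepare_strict(const struct platform_state *p)
{
	return p->has_partial_write_erratum ? -EOPNOTSUPP : 0;
}

/* Relaxed variant: refuse only if the erratum is present AND TDX is enabled. */
int kexec_prepare_relaxed(const struct platform_state *p)
{
	if (p->has_partial_write_erratum && p->tdx_enabled)
		return -EOPNOTSUPP;
	return 0;
}

/* Usage example: the two policies differ only when TDX is left disabled. */
void kexec_policy_selfcheck(void)
{
	struct platform_state tdx_off = { .has_partial_write_erratum = true,
					  .tdx_enabled = false };
	struct platform_state tdx_on = { .has_partial_write_erratum = true,
					 .tdx_enabled = true };

	assert(kexec_prepare_strict(&tdx_off) == -EOPNOTSUPP);
	assert(kexec_prepare_relaxed(&tdx_off) == 0);
	assert(kexec_prepare_strict(&tdx_on) == -EOPNOTSUPP);
	assert(kexec_prepare_relaxed(&tdx_on) == -EOPNOTSUPP);
}
```

The only case where the two differ is an erratum platform booted without
TDX enabled, which is the case I'm asking about.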
>
> The impact to userspace is that users will get an error when loading
> the kexec/kdump kernel image:
>
> kexec_load failed: Operation not supported
>
> This might be confusing to the users, thus also print the reason in the
> dmesg:
>