Re: [PATCH 4/7] x86/kexec: Disable kexec/kdump on platforms with TDX partial write erratum

From: Huang, Kai
Date: Wed Oct 01 2025 - 17:19:38 EST


On Wed, 2025-10-01 at 11:00 -0700, Hansen, Dave wrote:
> On 10/1/25 10:17, Vishal Annapurve wrote:
> > And also mentions:
> > "Also note only the normal kexec needs to worry about this problem, but
> > not the crash kexec: 1) The kdump kernel only uses the special memory
> > reserved by the first kernel, and the reserved memory can never be used
> > by TDX in the first kernel; 2) The /proc/vmcore, which reflects the
> > first (crashed) kernel's memory, is only for read. The read will never
> > "poison" TDX memory thus cause unexpected machine check (only partial
> > write does)."
> >
> > What was the scenario that led to disabling kdump support altogether
> > given the above description?
>
> I think it was purely out of convenience so that the disabling could be
> three lines of code.
>
> I don't know off the top of my head if there's a simple enough way to
> disable kexec but not kdump. When I applied the thing, I was probably
> just considering kexec/kdump a monolithic thing and not thinking that
> folks would want one but not the other.
>
> Kai, did you have any other motivations?

The "/proc/vmcore is only for read" statement reflects my understanding of
how the kdump kernel uses /proc/vmcore. I originally disabled only kexec and
still allowed kdump (something like the diff below [*]), but during internal
review we decided to disable both, since we cannot be sure that the
read-only assumption holds for all kdump users.

This was raised by Vishal publicly before and was discussed here (in v3):

https://lore.kernel.org/kvm/f8dcbe257b3931aec9e199132b678bd7681b7efa.camel@xxxxxxxxx/

[*]:

diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 15088d14904f..c7af4aa7dd6b 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -356,10 +356,11 @@ int machine_kexec_prepare(struct kimage *image)
 	 * On those platforms the old kernel must reset TDX private
 	 * memory before jumping to the new kernel otherwise the new
 	 * kernel may see unexpected machine check. For simplicity
-	 * just fail kexec/kdump on those platforms.
+	 * just fail kexec on those platforms. Still allow kdump since
+	 * the kdump kernel will only read TDX memory but not write.
 	 */
-	if (boot_cpu_has_bug(X86_BUG_TDX_PW_MCE)) {
-		pr_info_once("Not allowed on platform with tdx_pw_mce bug\n");
+	if (boot_cpu_has_bug(X86_BUG_TDX_PW_MCE) && image->type != KEXEC_TYPE_CRASH) {
+		pr_info_once("Kexec not allowed on platform with tdx_pw_mce bug\n");
 		return -EOPNOTSUPP;
 	}