Re: [PATCH v8 16/21] x86/virt/tdx: Reject updates during concurrent TD build
From: Dave Hansen
Date: Thu Apr 30 2026 - 15:26:11 EST

On 4/27/26 08:28, Chao Gao wrote:
> tl;dr: A TDX module erratum can silently corrupt TD measurement state if a
> module update races with TD build. Handle that by rejecting the update,
> instead of introducing new TD-build ioctl failure paths.

The downside of this needs to be discussed: module updates can be
blocked forever.

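
To make that downside concrete, here is a minimal userspace-side sketch
of why a retry has to be bounded. Everything in it is hypothetical and
not part of the patch: update_with_retry_limit(), the update_fn callback
type and fake_update() are made-up names, and the only thing taken from
the patch is the idea that an update attempt can fail with -EBUSY.

```c
#include <errno.h>

/* Hypothetical callback type: returns 0 on success or a negative errno. */
typedef int (*update_fn)(void *ctx);

/*
 * Retry an update while it reports -EBUSY, giving up after max_tries.
 * The last result is returned, so a persistent -EBUSY reaches the caller
 * instead of turning into an endless loop.
 */
static int update_with_retry_limit(update_fn try_update, void *ctx, int max_tries)
{
	int ret = -EINVAL;	/* returned as-is if max_tries <= 0 */
	int i;

	for (i = 0; i < max_tries; i++) {
		ret = try_update(ctx);
		if (ret != -EBUSY)
			break;
	}

	return ret;
}

/* Stand-in for a real update attempt: busy until *remaining reaches zero. */
static int fake_update(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return -EBUSY;
	}

	return 0;
}
```

If something sensitive is always in flight, every attempt returns
-EBUSY and the caller only ever sees the limit being hit. That is the
behavior the changelog should spell out.
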
> Long Version:
...

This explanation is confusing.

Focus on what the patch *does*, along with its features and downsides.
*Then* broach the alternatives. But, please, clearly separate the
description of this patch from the opining about alternatives.

> diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
> index de822ed9ef0b..b063aabe2554 100644
> --- a/arch/x86/include/asm/tdx.h
> +++ b/arch/x86/include/asm/tdx.h
> @@ -26,11 +26,18 @@
> #define TDX_SEAMCALL_GP (TDX_SW_ERROR | X86_TRAP_GP)
> #define TDX_SEAMCALL_UD (TDX_SW_ERROR | X86_TRAP_UD)
>
> +#define TDX_SEAMCALL_STATUS_MASK 0xFFFFFFFF00000000ULL
> +
> /*
> * TDX module SEAMCALL leaf function error codes
> */
> -#define TDX_SUCCESS 0ULL
> -#define TDX_RND_NO_ENTROPY 0x8000020300000000ULL
> +#define TDX_SUCCESS 0ULL
> +#define TDX_RND_NO_ENTROPY 0x8000020300000000ULL
> +#define TDX_UPDATE_COMPAT_SENSITIVE 0x8000051200000000ULL
> +
> +/* Bit definitions of TDX_FEATURES0 metadata field */
> +#define TDX_FEATURES0_NO_RBP_MOD BIT_ULL(18)
> +#define TDX_FEATURES0_UPDATE_COMPAT BIT_ULL(47)

Refactor first. Add new features second. Moving
TDX_SEAMCALL_STATUS_MASK should be its own patch.

> #ifndef __ASSEMBLER__
>
> diff --git a/arch/x86/kvm/vmx/tdx_errno.h b/arch/x86/kvm/vmx/tdx_errno.h
> index 6ff4672c4181..215c00d76a94 100644
> --- a/arch/x86/kvm/vmx/tdx_errno.h
> +++ b/arch/x86/kvm/vmx/tdx_errno.h
> @@ -4,8 +4,6 @@
> #ifndef __KVM_X86_TDX_ERRNO_H
> #define __KVM_X86_TDX_ERRNO_H
>
> -#define TDX_SEAMCALL_STATUS_MASK 0xFFFFFFFF00000000ULL
> -
> /*
> * TDX SEAMCALL Status Codes (returned in RAX)
> */
> diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
> index a7dfa4ee8813..7864ab68f4e3 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.c
> +++ b/arch/x86/virt/vmx/tdx/tdx.c
> @@ -1234,10 +1234,13 @@ static __init int tdx_enable(void)
> }
> subsys_initcall(tdx_enable);
>
> +#define TDX_SYS_SHUTDOWN_AVOID_COMPAT_SENSITIVE BIT(16)
> +
> int tdx_module_shutdown(void)
> {
> struct tdx_sys_info_handoff handoff = {};
> struct tdx_module_args args = {};
> + u64 err;
> int ret, cpu;
>
> ret = get_tdx_sys_info_handoff(&handoff);
> @@ -1248,9 +1251,26 @@ int tdx_module_shutdown(void)
> * module can produce and most likely supported by newer modules.
> */
> args.rcx = handoff.module_hv;
> - ret = seamcall_prerr(TDH_SYS_SHUTDOWN, &args);
> - if (ret)
> - return ret;
> +
> + /*
> + * Mitigate the erratum where updates can break concurrent TD
> + * build. Do not pre-check support for this flag. If unsupported,
> + * rely on the TDX module to reject shutdown requests.
> + */
> + args.rcx |= TDX_SYS_SHUTDOWN_AVOID_COMPAT_SENSITIVE;
"Mitigate the erratum..." is a strange way to start this.

I think this would be a much better format:

	/*
	 * This flag will <say what it does> if <triggering event>
	 * happens. That eliminates exposure to a TDX erratum which
	 * can <explain bad things here>.
	 *
	 * This flag is not supported by all TDX modules and may cause
	 * the shutdown (and subsequent update procedure) to fail.
	 */

> + err = seamcall(TDH_SYS_SHUTDOWN, &args);
> +
> + /*
> + * Return -EBUSY to signal that some ongoing flows are incompatible
> + * with updates so that userspace can retry.
> + */

	/*
	 * The shutdown ran into a "sensitive" ongoing operation, like
	 * TD build. Signal to userspace that it can retry.
	 */

> + if ((err & TDX_SEAMCALL_STATUS_MASK) == TDX_UPDATE_COMPAT_SENSITIVE)
> + return -EBUSY;
> + if (err) {
> + seamcall_err(TDH_SYS_SHUTDOWN, err, &args);
> + return -EIO;
> + }
Whitespace between the if()s please.
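
Roughly what I have in mind, as a standalone sketch rather than the
actual kernel code. The two constants are copied from the quoted hunk.
shutdown_status_to_errno() is a hypothetical helper name used only for
illustration, and the seamcall_err() logging from the patch is left out
so that this compiles outside the kernel.

```c
#include <errno.h>
#include <stdint.h>

/* Values copied from the quoted patch. */
#define TDX_SEAMCALL_STATUS_MASK	0xFFFFFFFF00000000ULL
#define TDX_UPDATE_COMPAT_SENSITIVE	0x8000051200000000ULL

/* Hypothetical helper, not part of the patch. */
static int shutdown_status_to_errno(uint64_t err)
{
	/* Compare only the upper 32 bits; the low 32 bits are masked off. */
	if ((err & TDX_SEAMCALL_STATUS_MASK) == TDX_UPDATE_COMPAT_SENSITIVE)
		return -EBUSY;

	if (err)
		return -EIO;

	return 0;
}
```

The blank line keeps the retryable case visually apart from the generic
failure case.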