Re: [PATCH v4 11/24] x86/virt/seamldr: Introduce skeleton for TDX Module updates

From: Edgecombe, Rick P

Date: Wed Mar 11 2026 - 22:00:49 EST


On Thu, 2026-02-12 at 06:35 -0800, Chao Gao wrote:
> TDX Module updates require careful synchronization with other TDX
> operations on the host. During updates, only update-related SEAMCALLs are
> permitted; all other SEAMCALLs must be blocked.
>
> However, SEAMCALLs can be invoked from different contexts (normal and IRQ
> context) and run in parallel across CPUs. And, all TD vCPUs must remain
> out of guest mode during updates.
>

Above it says only update-related SEAMCALLs are permitted. Does that not already
exclude SEAMCALLs that might allow entering the TD?

> No single lock primitive can satisfy
> all these synchronization requirements, so stop_machine() is used as the
> only well-understood mechanism that can meet them all.
>
> The TDX Module update process consists of several steps as described in
> Intel® Trust Domain Extensions (Intel® TDX) Module Base Architecture
> Specification, Revision 348549-007, Chapter 4.5 "TD-Preserving TDX Module
> Update"
>
> - shut down the old module
> - install the new module
> - global and per-CPU initialization
> - restore state information
>
> Some steps must execute on a single CPU, others must run serially across
> all CPUs, and some can run concurrently on all CPUs. There are also
> ordering requirements between steps, so all CPUs must work in a step-locked
> manner.

Does the fact that some steps can run on other CPUs add any synchronization
requirements? If not, I'd leave it off.

>
> In summary, TDX Module updates create two requirements:

The stop_machine() part seems more like a solution than a requirement.

>
> 1. The entire update process must use stop_machine() to synchronize with
> other TDX workloads
> 2. Update steps must be performed in a step-locked manner
>
> To prepare for implementing concrete TDX Module update steps, establish
> the framework by mimicking multi_cpu_stop(), which is a good example of
> performing a multi-step task in step-locked manner.
>

Offline, Chao pointed out that Paul suggested this after considering refactoring
out the common code. I think it might still be worth mentioning why you can't use
multi_cpu_stop() directly. I guess there are some differences. What are they?

> Specifically, use a
> global state machine to control each CPU's work and require all CPUs to
> acknowledge completion before proceeding to the next step.

Maybe add a bit more about the reasoning for requiring all CPUs to ack each
step. Tie it back to the lockstep part.
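
To illustrate the reasoning I have in mind, here is a rough userspace model of
the ack counter. It is only a sketch: pthreads and C11 atomics stand in for
stop_machine() and the kernel atomics, STEP_A/STEP_B and run_lockstep() are
made-up names, and none of it is the patch's code. The point it shows is that
the state only advances once the last thread acks, so no thread can observe
step N+1 before every thread has finished step N.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical steps; the patch itself only has TDP_START and TDP_DONE. */
enum step { STEP_START, STEP_A, STEP_B, STEP_DONE };

#define MAX_THREADS 16

static atomic_int state;                 /* current target step */
static atomic_int thread_ack;            /* threads that still have to ack */
static atomic_int work_done[STEP_DONE];  /* per-step completion counters */
static atomic_int violations;            /* lockstep breaches observed */
static int nthreads;

static void set_target_state(int next)
{
	/* Reset the ack counter before publishing the new state. */
	atomic_store(&thread_ack, nthreads);
	/* seq_cst store; stronger than the smp_wmb() + WRITE_ONCE() pair. */
	atomic_store(&state, next);
}

/* Last thread to ack a state moves everyone to the next state. */
static void ack_state(void)
{
	if (atomic_fetch_sub(&thread_ack, 1) == 1)
		set_target_state(atomic_load(&state) + 1);
}

static void *worker(void *arg)
{
	int cur = STEP_START, new;

	(void)arg;
	do {
		new = atomic_load(&state);
		if (new != cur) {
			cur = new;
			/* Lockstep check: the previous step must be fully done. */
			if (cur > STEP_A &&
			    atomic_load(&work_done[cur - 1]) != nthreads)
				atomic_fetch_add(&violations, 1);
			/* "Do" this step's work, then ack it. */
			if (cur != STEP_DONE)
				atomic_fetch_add(&work_done[cur], 1);
			ack_state();
		} else {
			sched_yield();
		}
	} while (cur != STEP_DONE);

	return NULL;
}

/* Run n threads through all steps; returns the number of violations seen. */
static int run_lockstep(int n)
{
	pthread_t tids[MAX_THREADS];
	int i;

	assert(n >= 1 && n <= MAX_THREADS);
	nthreads = n;
	atomic_store(&violations, 0);
	for (i = 0; i < STEP_DONE; i++)
		atomic_store(&work_done[i], 0);
	atomic_store(&state, STEP_START);

	/* Mirrors set_target_state(TDP_START + 1) before stop_machine(). */
	set_target_state(STEP_A);

	for (i = 0; i < n; i++)
		if (pthread_create(&tids[i], NULL, worker, NULL))
			abort();
	for (i = 0; i < n; i++)
		pthread_join(tids[i], NULL);

	return atomic_load(&violations);
}
```

If the ack requirement is doing its job, run_lockstep(4) should return 0 (build
with -pthread). Dropping the ack and letting one thread advance the state on its
own is what would let the others fall out of step, which I think is the
reasoning worth spelling out in the log.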

>
> Potential alternative to stop_machine()
> =======================================
> An alternative approach is to lock all KVM entry points and kick all
> vCPUs. Here, KVM entry points refer to KVM VM/vCPU ioctl entry points,
> implemented in KVM common code (virt/kvm). Adding a locking mechanism
> there would affect all architectures KVM supports. And to lock only TDX
> vCPUs, new logic would be needed to identify TDX vCPUs, which the KVM
> common code currently lacks. This would add significant complexity and
> maintenance overhead to KVM for this TDX-specific use case.
>
> Signed-off-by: Chao Gao <chao.gao@xxxxxxxxx>
> Reviewed-by: Xu Yilun <yilun.xu@xxxxxxxxxxxxxxx>
> Reviewed-by: Tony Lindgren <tony.lindgren@xxxxxxxxxxxxxxx>
> ---
> v2:
> - refine the changelog to follow context-problem-solution structure
> - move alternative discussions at the end of the changelog
> - add a comment about state machine transition
> - Move rcu_momentary_eqs() call to the else branch.
> ---
> arch/x86/virt/vmx/tdx/seamldr.c | 70 ++++++++++++++++++++++++++++++++-
> 1 file changed, 69 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/virt/vmx/tdx/seamldr.c b/arch/x86/virt/vmx/tdx/seamldr.c
> index 718cb8396057..21d572d75769 100644
> --- a/arch/x86/virt/vmx/tdx/seamldr.c
> +++ b/arch/x86/virt/vmx/tdx/seamldr.c
> @@ -10,8 +10,10 @@
> #include <linux/cpuhplock.h>
> #include <linux/cpumask.h>
> #include <linux/mm.h>
> +#include <linux/nmi.h>
> #include <linux/slab.h>
> #include <linux/spinlock.h>
> +#include <linux/stop_machine.h>
>
> #include <asm/seamldr.h>
>
> @@ -186,6 +188,68 @@ static struct seamldr_params *init_seamldr_params(const u8 *data, u32 size)
> return alloc_seamldr_params(module, module_size, sig, sig_size);
> }
>
> +/*
> + * During a TDX Module update, all CPUs start from TDP_START and progress
> + * to TDP_DONE. Each state is associated with certain work. For some
> + * states, just one CPU needs to perform the work, while other CPUs just
> + * wait during those states.
> + */
> +enum tdp_state {
> + TDP_START,
> + TDP_DONE,
> +};
> +
> +static struct {
> + enum tdp_state state;
> + atomic_t thread_ack;
> +} tdp_data;
> +
> +static void set_target_state(enum tdp_state state)
> +{
> + /* Reset ack counter. */
> + atomic_set(&tdp_data.thread_ack, num_online_cpus());
> + /* Ensure thread_ack is updated before the new state */
> + smp_wmb();
> + WRITE_ONCE(tdp_data.state, state);
> +}
> +
> +/* Last one to ack a state moves to the next state. */
> +static void ack_state(void)
> +{
> + if (atomic_dec_and_test(&tdp_data.thread_ack))
> + set_target_state(tdp_data.state + 1);
> +}
> +
> +/*
> + * See multi_cpu_stop() from where this multi-cpu state-machine was
> + * adopted, and the rationale for touch_nmi_watchdog()
> + */
> +static int do_seamldr_install_module(void *params)
> +{
> + enum tdp_state newstate, curstate = TDP_START;
> + int ret = 0;
> +
> + do {
> + /* Chill out and re-read tdp_data */
> + cpu_relax();
> + newstate = READ_ONCE(tdp_data.state);
> +
> + if (newstate != curstate) {
> + curstate = newstate;
> + switch (curstate) {

Maybe a little comment here like "todo add the steps".

> + default:
> + break;
> + }
> + ack_state();
> + } else {
> + touch_nmi_watchdog();
> + rcu_momentary_eqs();
> + }
> + } while (curstate != TDP_DONE);
> +
> + return ret;
> +}
> +
> DEFINE_FREE(free_seamldr_params, struct seamldr_params *,
> if (!IS_ERR_OR_NULL(_T)) free_seamldr_params(_T))
>
> @@ -223,7 +287,11 @@ int seamldr_install_module(const u8 *data, u32 size)
> return -EBUSY;
> }
>
> - /* TODO: Update TDX Module here */
> + set_target_state(TDP_START + 1);
> + ret = stop_machine_cpuslocked(do_seamldr_install_module, params, cpu_online_mask);
> + if (ret)
> + return ret;
> +
> return 0;
> }
> EXPORT_SYMBOL_FOR_MODULES(seamldr_install_module, "tdx-host");