Re: [PATCH 1/1] x86/tdx: Route safe halt execution via tdx_safe_halt

From: Dave Hansen
Date: Tue Jan 28 2025 - 17:08:23 EST


On 1/28/25 13:36, Vishal Annapurve wrote:
> Direct HLT instruction execution causes #VEs for TDX VMs which is routed
> to hypervisor via tdvmcall. This process renders HLT instruction
> execution inatomic, so any preceeding instructions like STI/MOV SS will
> end up enabling interrupts before the HLT instruction is routed to the
> hypervisor. This creates scenarios where interrupts could land during
> HLT instruction emulation without aborting halt operation leading to
> idefinite halt wait times.

Could you please break out the spell checker before posting v2? There
are a couple problems in that paragraph.

> x86_idle is already upgraded to invoke tdx_safe_halt to avoid such

Please add parenthesis to functions() to make it more clear what you are
referring to.

> scenarios, but it didn't cover pvnative_safe_halt which can be invoked
> using raw_safe_halt from call sites like acpi_safe_halt (acpi_pm
> subsystem). This patch upgrades the safe_halt executions to use
> tdx_safe_halt.

No "this patch", please.

> To avoid future call sites which cause HLT instruction emulation with
> irqs enabled, add a warn and fail the HLT instruction emulation.

This seems like a bug fix. Shouldn't it have a cc:stable@ and a Fixes: tag?

Do you have any thoughts on why nobody has hit this up to now? Are TDX
users not enabling PARAVIRT_XXL? Not using ACPI?

> diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
> index 0d9b090b4880..98b5f317596d 100644
> --- a/arch/x86/coco/tdx/tdx.c
> +++ b/arch/x86/coco/tdx/tdx.c
> @@ -14,6 +14,7 @@
> #include <asm/ia32.h>
> #include <asm/insn.h>
> #include <asm/insn-eval.h>
> +#include <asm/paravirt_types.h>
> #include <asm/pgtable.h>
> #include <asm/set_memory.h>
> #include <asm/traps.h>
> @@ -380,6 +381,11 @@ static int handle_halt(struct ve_info *ve)
> {
> const bool irq_disabled = irqs_disabled();
>
> + if (!irq_disabled) {
> + WARN(1, "HLT instruction emulation unsafe with irqs enabled\n");
> + return -EIO;
> + }

Yeah, this warning is a good idea. But probably best left as a
WARN_ONCE() so it doesn't spew too badly.

> @@ -1083,6 +1089,15 @@ void __init tdx_early_init(void)
> x86_platform.guest.enc_kexec_begin = tdx_kexec_begin;
> x86_platform.guest.enc_kexec_finish = tdx_kexec_finish;
>
> +#ifdef CONFIG_PARAVIRT_XXL
> + /*
> + * halt instruction execution is not atomic for TDX VMs as it generates
> + * #VEs, so otherwise "safe" halt invocations which cause interrupts to
> + * get enabled right after halt instruction don't work for TDX VMs.
> + */
> + pv_ops.irq.safe_halt = tdx_safe_halt;
> +#endif
The basic bug here was that there was a path to a hlt instruction that
folks didn't realize. This patch fixes the basic bug and gives us a nice
warning if there are additional paths that weren't imagined.

But it doesn't really help us audit the code to make it clear that TDX
guest kernel's _can't_ screw up hlt again the same way. This, for
instance would make it pretty clear:

static __always_inline void native_safe_halt(void)
{
if (cpu_feature_enabled(X86_FEATURE_TDX_GUEST))
tdx_safe_halt();
mds_idle_clear_cpu_buffers();
asm volatile("sti; hlt": : :"memory");
}

There are reasons we wouldn't want to do that exactly, but I'd much
prefer something that is harder to screw up than the proposal above.
Anybody have any better ideas?