Re: [PATCH 1/1] x86/tdx: Route safe halt execution via tdx_safe_halt

From: Dave Hansen
Date: Tue Jan 28 2025 - 17:08:23 EST

Next message: Frank Li: "[PATCH v9 0/7] PCI: dwc: opitimaze RC Host/EP pci_fixup_addr()"
Previous message: Jonathan Corbet: "Re: [PATCH] .mailmap: Update incorrect entry and my email in MAINTAINERS"
In reply to: Vishal Annapurve: "[PATCH 1/1] x86/tdx: Route safe halt execution via tdx_safe_halt"
Next in thread: Vishal Annapurve: "Re: [PATCH 1/1] x86/tdx: Route safe halt execution via tdx_safe_halt"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 1/28/25 13:36, Vishal Annapurve wrote:
> Direct HLT instruction execution causes #VEs for TDX VMs which is routed
> to hypervisor via tdvmcall. This process renders HLT instruction
> execution inatomic, so any preceeding instructions like STI/MOV SS will
> end up enabling interrupts before the HLT instruction is routed to the
> hypervisor. This creates scenarios where interrupts could land during
> HLT instruction emulation without aborting halt operation leading to
> idefinite halt wait times.

Could you please break out the spell checker before posting v2? There
are a couple problems in that paragraph.

> x86_idle is already upgraded to invoke tdx_safe_halt to avoid such

Please add parenthesis to functions() to make it more clear what you are
referring to.

> scenarios, but it didn't cover pvnative_safe_halt which can be invoked
> using raw_safe_halt from call sites like acpi_safe_halt (acpi_pm
> subsystem). This patch upgrades the safe_halt executions to use
> tdx_safe_halt.

No "this patch", please.

> To avoid future call sites which cause HLT instruction emulation with
> irqs enabled, add a warn and fail the HLT instruction emulation.

This seems like a bug fix. Shouldn't it have a cc:stable@ and a Fixes: tag?

Do you have any thoughts on why nobody has hit this up to now? Are TDX
users not enabling PARAVIRT_XXL? Not using ACPI?

> diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
> index 0d9b090b4880..98b5f317596d 100644
> --- a/arch/x86/coco/tdx/tdx.c
> +++ b/arch/x86/coco/tdx/tdx.c
> @@ -14,6 +14,7 @@
> #include <asm/ia32.h>
> #include <asm/insn.h>
> #include <asm/insn-eval.h>
> +#include <asm/paravirt_types.h>
> #include <asm/pgtable.h>
> #include <asm/set_memory.h>
> #include <asm/traps.h>
> @@ -380,6 +381,11 @@ static int handle_halt(struct ve_info *ve)
> {
> const bool irq_disabled = irqs_disabled();
>
> + if (!irq_disabled) {
> + WARN(1, "HLT instruction emulation unsafe with irqs enabled\n");
> + return -EIO;
> + }

Yeah, this warning is a good idea. But probably best left as a
WARN_ONCE() so it doesn't spew too badly.

> @@ -1083,6 +1089,15 @@ void __init tdx_early_init(void)
> x86_platform.guest.enc_kexec_begin = tdx_kexec_begin;
> x86_platform.guest.enc_kexec_finish = tdx_kexec_finish;
>
> +#ifdef CONFIG_PARAVIRT_XXL
> + /*
> + * halt instruction execution is not atomic for TDX VMs as it generates
> + * #VEs, so otherwise "safe" halt invocations which cause interrupts to
> + * get enabled right after halt instruction don't work for TDX VMs.
> + */
> + pv_ops.irq.safe_halt = tdx_safe_halt;
> +#endif
The basic bug here was that there was a path to a hlt instruction that
folks didn't realize. This patch fixes the basic bug and gives us a nice
warning if there are additional paths that weren't imagined.

But it doesn't really help us audit the code to make it clear that TDX
guest kernel's _can't_ screw up hlt again the same way. This, for
instance would make it pretty clear:

static __always_inline void native_safe_halt(void)
{
if (cpu_feature_enabled(X86_FEATURE_TDX_GUEST))
tdx_safe_halt();
mds_idle_clear_cpu_buffers();
asm volatile("sti; hlt": : :"memory");
}

There are reasons we wouldn't want to do that exactly, but I'd much
prefer something that is harder to screw up than the proposal above.
Anybody have any better ideas?

Next message: Frank Li: "[PATCH v9 0/7] PCI: dwc: opitimaze RC Host/EP pci_fixup_addr()"
Previous message: Jonathan Corbet: "Re: [PATCH] .mailmap: Update incorrect entry and my email in MAINTAINERS"
In reply to: Vishal Annapurve: "[PATCH 1/1] x86/tdx: Route safe halt execution via tdx_safe_halt"
Next in thread: Vishal Annapurve: "Re: [PATCH 1/1] x86/tdx: Route safe halt execution via tdx_safe_halt"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]