Re: [PATCH v2 5/6] KVM: x86: Track available/dirty register masks as "unsigned long" values

From: Sean Christopherson

Date: Mon Apr 13 2026 - 11:01:40 EST


On Mon, Apr 13, 2026, Kai Huang wrote:
> On Thu, 2026-04-09 at 15:42 -0700, Sean Christopherson wrote:
> > -#define TDX_REGS_AVAIL_SET (BIT_ULL(VCPU_REG_EXIT_INFO_1) | \
> > - BIT_ULL(VCPU_REG_EXIT_INFO_2) | \
> > - BIT_ULL(VCPU_REGS_RAX) | \
> > - BIT_ULL(VCPU_REGS_RBX) | \
> > - BIT_ULL(VCPU_REGS_RCX) | \
> > - BIT_ULL(VCPU_REGS_RDX) | \
> > - BIT_ULL(VCPU_REGS_RBP) | \
> > - BIT_ULL(VCPU_REGS_RSI) | \
> > - BIT_ULL(VCPU_REGS_RDI) | \
> > - BIT_ULL(VCPU_REGS_R8) | \
> > - BIT_ULL(VCPU_REGS_R9) | \
> > - BIT_ULL(VCPU_REGS_R10) | \
> > - BIT_ULL(VCPU_REGS_R11) | \
> > - BIT_ULL(VCPU_REGS_R12) | \
> > - BIT_ULL(VCPU_REGS_R13) | \
> > - BIT_ULL(VCPU_REGS_R14) | \
> > - BIT_ULL(VCPU_REGS_R15))
> > +#define TDX_REGS_AVAIL_SET (BIT(VCPU_REG_EXIT_INFO_1) | \
> > + BIT(VCPU_REG_EXIT_INFO_2) | \
> > + BIT(VCPU_REGS_RAX) | \
> > + BIT(VCPU_REGS_RBX) | \
> > + BIT(VCPU_REGS_RCX) | \
> > + BIT(VCPU_REGS_RDX) | \
> > + BIT(VCPU_REGS_RBP) | \
> > + BIT(VCPU_REGS_RSI) | \
> > + BIT(VCPU_REGS_RDI) | \
> > + BIT(VCPU_REGS_R8) | \
> > + BIT(VCPU_REGS_R9) | \
> > + BIT(VCPU_REGS_R10) | \
> > + BIT(VCPU_REGS_R11) | \
> > + BIT(VCPU_REGS_R12) | \
> > + BIT(VCPU_REGS_R13) | \
> > + BIT(VCPU_REGS_R14) | \
> > + BIT(VCPU_REGS_R15))
> >  
>
> Not related to this series, but this made me look into whether these
> registers are truly needed to be set as available for TDX.
>
> Firstly, all the listed registers are marked as available immediately after
> exiting from tdh_vp_enter(). But while VCPU_REG_EXIT_INFO_1 and
> VCPU_REG_EXIT_INFO_2 are immediately saved to the common 'struct vcpu_vt',
> none of the GPRs are saved to vcpu->arch.regs[], which means marking GPRs
> available immediately doesn't quite make sense.
>
> In fact, IIUC other than when the TD exits with TDVMCALL, on which the TD
> shares a couple of GPRs with KVM, KVM has no way to get the TD's GPRs. So
> perhaps it makes more sense to mark the shared GPRs available upon TDVMCALL.
>
> But even that does not make sense from KVM's "GPR available" perspective,
> because TDVMCALL has a different ABI from KVM's existing infrastructure for
> e.g., CPUID/MSR emulation. E.g., KVM uses RCX/RAX/RDX for MSR emulation,
> but TDVMCALL<MSR.WRITE> uses R12 and R13 to convey MSR index/value:
>
> case EXIT_REASON_MSR_WRITE:
> kvm_rcx_write(vcpu, tdx->vp_enter_args.r12);
> kvm_rax_write(vcpu, tdx->vp_enter_args.r13 & -1u);
> kvm_rdx_write(vcpu, tdx->vp_enter_args.r13 >> 32);
>
> So I think the most accurate way is to explicitly mark the relevant GPRs
> available for each type of TDVMCALL. I am not sure whether it's worth doing
> though, because AFAICT there's no real bug in the existing code, other than
> "marking GPRs not in vcpu->arch.regs[] as available looks wrong".
>
> A less invasive way is to mark all possible GPRs that can be used in
> TDVMCALL emulation available once after TD exits. AFAICT the KVM hypercall
> uses most GPRs (RAX/RBX/RCX/RDX/RSI) and all other TDVMCALLs only use a
> subset, so maybe we can remove other GPRs from the available list (the diff
> in [*] passed my test of booting/destroying TD).
>
> But again, not sure whether it's worth doing.

Not worth doing. Because VMX and SVM make all GPRs available immediately, except
for RSP, KVM ignores avail/dirty for GPRs. I.e. "fixing" TDX will just shift the
"bugs" elsewhere.

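To illustrate why tightening the mask for TDX wouldn't buy anything, here's a
minimal userspace sketch. It is not KVM code: every sketch_* name is made up for
the example, and the "hardware state" array is just a stand-in for a lazily read
field such as RSP on VMX. The point it tries to show is that a checked accessor
consults the available mask, whereas a direct GPR accessor never looks at it.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's BIT(), which yields an unsigned long. */
#define SKETCH_BIT(nr) (1UL << (nr))

/* Hypothetical register indices, for illustration only. */
enum sketch_reg {
	SKETCH_REG_RAX,
	SKETCH_REG_RCX,
	SKETCH_REG_RDX,
	SKETCH_REG_RSP,
	SKETCH_REG_NR,
};

/* GPRs that are made available eagerly on every exit (everything but RSP). */
#define SKETCH_EAGER_REGS (SKETCH_BIT(SKETCH_REG_RAX) | \
			   SKETCH_BIT(SKETCH_REG_RCX) | \
			   SKETCH_BIT(SKETCH_REG_RDX))

struct sketch_vcpu {
	unsigned long regs_avail;		/* one bit per register */
	unsigned long regs[SKETCH_REG_NR];	/* software register cache */
	unsigned long hw_state[SKETCH_REG_NR];	/* stand-in for hardware state */
	unsigned int lazy_loads;		/* demo-only counter */
};

static bool sketch_reg_is_available(const struct sketch_vcpu *vcpu,
				    enum sketch_reg reg)
{
	return vcpu->regs_avail & SKETCH_BIT(reg);
}

/* On exit, everything except RSP is considered available. */
static void sketch_vm_exit(struct sketch_vcpu *vcpu)
{
	vcpu->regs_avail = SKETCH_EAGER_REGS;
}

/* Checked read: consults the mask and lazily fills the cache if needed. */
static unsigned long sketch_reg_read(struct sketch_vcpu *vcpu,
				     enum sketch_reg reg)
{
	if (!sketch_reg_is_available(vcpu, reg)) {
		vcpu->regs[reg] = vcpu->hw_state[reg];
		vcpu->regs_avail |= SKETCH_BIT(reg);
		vcpu->lazy_loads++;
	}
	return vcpu->regs[reg];
}

/* Direct GPR read: never looks at regs_avail, so the mask can't protect it. */
static unsigned long sketch_rax_read(const struct sketch_vcpu *vcpu)
{
	return vcpu->regs[SKETCH_REG_RAX];
}
```

In this model, clearing RAX's bit in regs_avail changes nothing for callers of
sketch_rax_read(), which is the sense in which a narrower TDX mask would only
move the inconsistency around.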
More importantly, because the TDX-Module *requires* RCX (the GPR that holds the
mask of registers to expose to the VMM) to be hidden on TDVMCALL, KVM *can't*
do any kind of meaningful "available" tracking. Versus sev_es_validate_vmgexit(),
which can at least sanity check that the registers needed to service a hypercall
have valid data.
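
As a rough sketch of the kind of sanity check that a guest-visible "valid" mask
allows (and that a hidden RCX rules out), something along these lines is
possible. This is not the actual sev_es_validate_vmgexit() logic: the vmg_*
names and the per-exit register table are assumptions made up for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's BIT(); all vmg_* names are hypothetical. */
#define VMG_BIT(nr) (1UL << (nr))

enum vmg_reg { VMG_RAX, VMG_RBX, VMG_RCX, VMG_RDX };

enum vmg_exit { VMG_EXIT_MSR_WRITE, VMG_EXIT_MSR_READ };

/* Registers the host must consume to service each exit (illustrative table). */
static unsigned long vmg_required_regs(enum vmg_exit exit)
{
	switch (exit) {
	case VMG_EXIT_MSR_WRITE:
		/* MSR index in RCX, value split across RAX (low) and RDX (high). */
		return VMG_BIT(VMG_RCX) | VMG_BIT(VMG_RAX) | VMG_BIT(VMG_RDX);
	case VMG_EXIT_MSR_READ:
		/* Only the MSR index is needed as input. */
		return VMG_BIT(VMG_RCX);
	}
	/* Unknown exit: demand every bit so only an all-ones mask would pass. */
	return ~0UL;
}

/* Accept the exit only if every required register was marked valid. */
static bool vmg_validate(enum vmg_exit exit, unsigned long valid_mask)
{
	unsigned long required = vmg_required_regs(exit);

	return (valid_mask & required) == required;
}
```

With TDX there is nothing to pass as valid_mask, because the mask lives in RCX
and RCX is exactly the register the VMM doesn't get to see on TDVMCALL.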

So unfortunately, since we need to rely on testing to verify KVM's implementation
no matter what, I don't think it'd be a net positive to overhaul KVM's handling
of GPRs to support SEV-ES+'s and TDX's "sometimes available" GPR set.