Re: [PATCH v2 5/6] KVM: x86: Track available/dirty register masks as "unsigned long" values

From: Huang, Kai

Date: Mon Apr 13 2026 - 19:06:25 EST


On Mon, 2026-04-13 at 07:54 -0700, Sean Christopherson wrote:
> On Mon, Apr 13, 2026, Kai Huang wrote:
> > On Thu, 2026-04-09 at 15:42 -0700, Sean Christopherson wrote:
> > > -#define TDX_REGS_AVAIL_SET (BIT_ULL(VCPU_REG_EXIT_INFO_1) | \
> > > - BIT_ULL(VCPU_REG_EXIT_INFO_2) | \
> > > - BIT_ULL(VCPU_REGS_RAX) | \
> > > - BIT_ULL(VCPU_REGS_RBX) | \
> > > - BIT_ULL(VCPU_REGS_RCX) | \
> > > - BIT_ULL(VCPU_REGS_RDX) | \
> > > - BIT_ULL(VCPU_REGS_RBP) | \
> > > - BIT_ULL(VCPU_REGS_RSI) | \
> > > - BIT_ULL(VCPU_REGS_RDI) | \
> > > - BIT_ULL(VCPU_REGS_R8) | \
> > > - BIT_ULL(VCPU_REGS_R9) | \
> > > - BIT_ULL(VCPU_REGS_R10) | \
> > > - BIT_ULL(VCPU_REGS_R11) | \
> > > - BIT_ULL(VCPU_REGS_R12) | \
> > > - BIT_ULL(VCPU_REGS_R13) | \
> > > - BIT_ULL(VCPU_REGS_R14) | \
> > > - BIT_ULL(VCPU_REGS_R15))
> > > +#define TDX_REGS_AVAIL_SET (BIT(VCPU_REG_EXIT_INFO_1) | \
> > > + BIT(VCPU_REG_EXIT_INFO_2) | \
> > > + BIT(VCPU_REGS_RAX) | \
> > > + BIT(VCPU_REGS_RBX) | \
> > > + BIT(VCPU_REGS_RCX) | \
> > > + BIT(VCPU_REGS_RDX) | \
> > > + BIT(VCPU_REGS_RBP) | \
> > > + BIT(VCPU_REGS_RSI) | \
> > > + BIT(VCPU_REGS_RDI) | \
> > > + BIT(VCPU_REGS_R8) | \
> > > + BIT(VCPU_REGS_R9) | \
> > > + BIT(VCPU_REGS_R10) | \
> > > + BIT(VCPU_REGS_R11) | \
> > > + BIT(VCPU_REGS_R12) | \
> > > + BIT(VCPU_REGS_R13) | \
> > > + BIT(VCPU_REGS_R14) | \
> > > + BIT(VCPU_REGS_R15))
> > >  
> >
> > Not related to this series, but this made me look into whether these
> > registers are truly needed to be set as available for TDX.
> >
> > Firstly, all the listed registers are marked as available immediately after
> > exiting from tdh_vp_enter(). But while VCPU_REG_EXIT_INFO_1 and
> > VCPU_REG_EXIT_INFO_2 are immediately saved to the common 'struct vcpu_vt',
> > the GPRs are not saved to vcpu->arch.regs[], which means marking the GPRs
> > available immediately doesn't quite make sense.
> >
> > In fact, IIUC, other than when the TD exits with a TDVMCALL, on which the TD
> > shares a couple of GPRs with KVM, KVM has no way to get the TD's GPRs. So
> > perhaps it makes more sense to mark the shared GPRs available upon TDVMCALL.
> >
> > But even that does not make sense from KVM's "GPR available" perspective,
> > because TDVMCALL has a different ABI from KVM's existing infrastructure for
> > e.g., CPUID/MSR emulation. E.g., KVM uses RCX/RAX/RDX for MSR emulation,
> > but TDVMCALL<MSR.WRITE> uses R12 and R13 to convey MSR index/value:
> >
> > 	case EXIT_REASON_MSR_WRITE:
> > 		kvm_rcx_write(vcpu, tdx->vp_enter_args.r12);
> > 		kvm_rax_write(vcpu, tdx->vp_enter_args.r13 & -1u);
> > 		kvm_rdx_write(vcpu, tdx->vp_enter_args.r13 >> 32);
> >
> > So I think the most accurate way is to explicitly mark the relevant GPRs
> > available for each type of TDVMCALL. I am not sure whether it's worth doing
> > though, because AFAICT there's no real bug in the existing code, other than
> > "marking GPRs not in vcpu->arch.regs[] as available looks wrong".
> >
> > A less invasive way is to mark all possible GPRs that can be used in
> > TDVMCALL emulation available once after TD exits. AFAICT the KVM hypercall
> > uses most GPRs (RAX/RBX/RCX/RDX/RSI) and all other TDVMCALLs only use a
> > subset, so maybe we can remove other GPRs from the available list (the diff
> > in [*] passed my test of booting/destroying TD).
> >
> > But again, not sure whether it's worth doing.
>
> Not worth doing.  
>

Fine with me. :-)

> Because VMX and SVM make all GPRs available immediately, except
> for RSP, KVM ignores avail/dirty for GPRs. I.e. "fixing" TDX will just shift the
> "bugs" elsewhere.

Just want to understand:

I thought the fix could be to simply remove the wrong GPRs from the list.
How would fixing TDX shift bugs elsewhere?

>
> More importantly, because the TDX-Module *requires* RCX (the GPR that holds the
> mask of registers to expose to the VMM) to be hidden on TDVMCALL, KVM *can't*
> do any kind of meaningful "available" tracking.  
>

Hmm, I think RCX conveys the mask of shared GPRs and the VMM can read it. Per
"Table 5.323: TDH.VP.ENTER Output Operands Format #5 Definition: On
TDCALL(TDG.VP.VMCALL) Following a TD Entry":

RCX ...
  Bit(s)  Name         Description

  31:0    PARAMS_MASK  Value as passed into TDCALL(TDG.VP.VMCALL) by
                       the guest TD: indicates which part of the guest
                       TD GPR and XMM state is passed as-is to the VMM
                       and back. For details, see the description of
                       TDG.VP.VMCALL in 5.5.26.
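
As a rough illustration of how a VMM could consume that value, here is a
minimal standalone sketch (userspace C, not KVM code). The enum and the
helper gpr_is_shared() are hypothetical names. It also assumes that bit N of
the low 16 bits of PARAMS_MASK corresponds to the GPR with x86 encoding N;
the spec excerpt above only says the mask covers GPR and XMM state, so that
mapping should be checked against the TDG.VP.VMCALL description.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical GPR indices, ASSUMED to follow the x86 register encoding. */
enum gpr {
	GPR_RAX, GPR_RCX, GPR_RDX, GPR_RBX,
	GPR_RSP, GPR_RBP, GPR_RSI, GPR_RDI,
	GPR_R8,  GPR_R9,  GPR_R10, GPR_R11,
	GPR_R12, GPR_R13, GPR_R14, GPR_R15,
};

/*
 * Return true if the guest marked 'reg' as passed as-is to the VMM.
 * Only RCX bits 31:0 (PARAMS_MASK) are considered; the upper half of
 * RCX is ignored.
 */
static bool gpr_is_shared(uint64_t rcx, enum gpr reg)
{
	uint32_t params_mask = (uint32_t)rcx;

	return (params_mask & (UINT32_C(1) << reg)) != 0;
}
```

With such a helper, per-TDVMCALL "available" tracking would amount to
testing the mask for each GPR the handler is about to read.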

I think the problem is, as said previously, that the KVM TDX code currently
uses KVM's existing infrastructure to emulate MSR accesses, KVM hypercalls,
etc., but TDVMCALL has a different ABI, so there's a mismatch here.
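
To make the mismatch concrete, below is a minimal standalone sketch
(userspace C, not the actual KVM code) of the translation that the
EXIT_REASON_MSR_WRITE snippet quoted earlier performs: the TDVMCALL passes
the MSR index in R12 and the 64-bit value in R13, while WRMSR-style emulation
expects the index in ECX and the value split across EDX:EAX. The struct and
function names are hypothetical stand-ins.

```c
#include <stdint.h>

/* Hypothetical stand-in for the GPRs a TDVMCALL<MSR.WRITE> shares. */
struct tdvmcall_args {
	uint64_t r12;	/* MSR index */
	uint64_t r13;	/* 64-bit MSR value */
};

/* Hypothetical stand-in for the registers WRMSR emulation consumes. */
struct wrmsr_regs {
	uint32_t ecx;	/* MSR index */
	uint32_t eax;	/* low 32 bits of the value */
	uint32_t edx;	/* high 32 bits of the value */
};

/* Repack the TDVMCALL layout into the ECX/EAX/EDX layout. */
static struct wrmsr_regs tdvmcall_msr_write_to_wrmsr(const struct tdvmcall_args *args)
{
	struct wrmsr_regs regs = {
		.ecx = (uint32_t)args->r12,
		.eax = (uint32_t)(args->r13 & 0xffffffffu),
		.edx = (uint32_t)(args->r13 >> 32),
	};

	return regs;
}
```

In this sketch the registers that end up holding the MSR state (RCX/RAX/RDX)
are written by the VMM itself, not read from the guest, which is why an
"available" bit for them says little about what the TD actually exposed.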