Re: [PATCH V2 05/12] KVM: TDX: Implement TDX vcpu enter/exit path
From: Adrian Hunter
Date: Thu Mar 06 2025 - 14:13:52 EST
On 6/03/25 20:19, Paolo Bonzini wrote:
> On 2/27/25 19:37, Adrian Hunter wrote:
>> On 25/02/25 08:15, Xiaoyao Li wrote:
>>> On 2/24/2025 8:27 PM, Adrian Hunter wrote:
>>>> On 20/02/25 15:16, Xiaoyao Li wrote:
>>>>> On 1/29/2025 5:58 PM, Adrian Hunter wrote:
>>>>>> +#define TDX_REGS_UNSUPPORTED_SET (BIT(VCPU_EXREG_RFLAGS) | \
>>>>>> + BIT(VCPU_EXREG_SEGMENTS))
>>>>>> +
>>>>>> +fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu, bool force_immediate_exit)
>>>>>> +{
>>>>>> + /*
>>>>>> + * force_immediate_exit requires entering the vCPU for event injection,
>>>>>> + * followed by an immediate exit. But the TDX module doesn't guarantee
>>>>>> + * entry; it's already possible for KVM to _think_ it completed entry
>>>>>> + * to the guest without actually having done so.
>>>>>> + * Since KVM never needs to force an immediate exit for TDX, and can't
>>>>>> + * do direct injection, just warn on force_immediate_exit.
>>>>>> + */
>>>>>> + WARN_ON_ONCE(force_immediate_exit);
>>>>>> +
>>>>>> + trace_kvm_entry(vcpu, force_immediate_exit);
>>>>>> +
>>>>>> + tdx_vcpu_enter_exit(vcpu);
>>>>>> +
>>>>>> + vcpu->arch.regs_avail &= ~TDX_REGS_UNSUPPORTED_SET;
>>>>>
>>>>> I don't understand this. Why only clear RFLAGS and SEGMENTS?
>>>>>
>>>>> When creating the vcpu, vcpu->arch.regs_avail = ~0 in kvm_arch_vcpu_create().
>>>>>
>>>>> Now it only clears RFLAGS and SEGMENTS for a TDX vcpu, which leaves the other bits set. But I don't see any code that syncs the guest values into vcpu->arch.regs[reg].
>>>>
>>>> TDX guest registers are generally not known but
>>>> values are placed into vcpu->arch.regs when needed
>>>> to work with common code.
>>>>
>>>> We used to use ~VMX_REGS_LAZY_LOAD_SET and tdx_cache_reg()
>>>> which has since been removed.
>>>>
>>>> tdx_cache_reg() did not support RFLAGS, SEGMENTS,
>>>> EXIT_INFO_1/EXIT_INFO_2 but EXIT_INFO_1/EXIT_INFO_2 became
>>>> needed, so that just left RFLAGS, SEGMENTS.
>>>
>>> Quote what Sean said [1]
>>>
>>> “I'm also not convinced letting KVM read garbage for RIP, RSP, CR3, or
>>> PDPTRs is at all reasonable. CR3 and PDPTRs should be unreachable,
>>> and I gotta imagine the same holds true for RSP. Allow reads/writes
>>> to RIP is fine, in that it probably simplifies the overall code.”
>>>
>>> We need to justify why letting KVM read "garbage" for VCPU_REGS_RIP,
>>> VCPU_EXREG_PDPTR, VCPU_EXREG_CR0, VCPU_EXREG_CR3, VCPU_EXREG_CR4,
>>> VCPU_EXREG_EXIT_INFO_1, and VCPU_EXREG_EXIT_INFO_2 is needed.
>>>
>>> The changelog justifies none of it.
>>
>> Could add VCPU_REGS_RIP, VCPU_REGS_RSP, VCPU_EXREG_CR3, VCPU_EXREG_PDPTR.
>> But not VCPU_EXREG_CR0 nor VCPU_EXREG_CR4 since we started using them.
>
> Hi Adrian,
>
> how is CR0 used? And CR4 is only used other than for loading the XSAVE state, I think?

I meant CR0 is used in the sense that patch "[PATCH V2 07/12] KVM: TDX:
restore host xsave state when exit from the guest TD" provides a value for it.
But it looks like it might be accessible via:

  store_regs()
    __get_sregs()
      __get_sregs_common()

Sean wanted a maximal CR0 value consistent with CR4.
CR4 is also being used in kvm_update_cpuid_runtime().

>
> I will change this to a list of specific available registers instead of using "&= ~", and it would be even better if CR0/CR4 are not on the list.
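
For illustration only, the difference between clearing bits with "&= ~" and
assigning an explicit list could be sketched in plain user-space C as below.
This is not the actual change: the DEMO_* names and bit positions are made
up, and the contents of the allow-list are an assumption based only on
EXIT_INFO_1/EXIT_INFO_2 being needed, as mentioned in this thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical register indices, for illustration only. The kernel's
 * real enum kvm_reg has different members and values.
 */
enum demo_reg {
	DEMO_REG_RSP,
	DEMO_REG_RIP,
	DEMO_EXREG_CR0,
	DEMO_EXREG_CR3,
	DEMO_EXREG_CR4,
	DEMO_EXREG_RFLAGS,
	DEMO_EXREG_SEGMENTS,
	DEMO_EXREG_EXIT_INFO_1,
	DEMO_EXREG_EXIT_INFO_2,
};

#define DEMO_BIT(n) (UINT32_C(1) << (n))

/* Deny-list: everything not named here stays marked available. */
#define DEMO_REGS_UNSUPPORTED_SET (DEMO_BIT(DEMO_EXREG_RFLAGS) | \
				   DEMO_BIT(DEMO_EXREG_SEGMENTS))

/* Allow-list (assumed contents): only what is named here is available. */
#define DEMO_REGS_AVAIL_SET (DEMO_BIT(DEMO_EXREG_EXIT_INFO_1) | \
			     DEMO_BIT(DEMO_EXREG_EXIT_INFO_2))

/* Deny-list style: start from the current mask and clear some bits. */
static uint32_t demo_avail_denylist(uint32_t regs_avail)
{
	return regs_avail & ~DEMO_REGS_UNSUPPORTED_SET;
}

/* Allow-list style: ignore the previous mask and assign the list. */
static uint32_t demo_avail_allowlist(void)
{
	return DEMO_REGS_AVAIL_SET;
}

static int demo_reg_is_avail(uint32_t regs_avail, enum demo_reg reg)
{
	return (regs_avail & DEMO_BIT(reg)) != 0;
}

/* Small self-check showing how the two styles differ for CR3. */
static void demo_self_check(void)
{
	uint32_t all = ~UINT32_C(0);	/* like regs_avail = ~0 at vcpu creation */

	assert(demo_reg_is_avail(demo_avail_denylist(all), DEMO_EXREG_CR3));
	assert(!demo_reg_is_avail(demo_avail_allowlist(), DEMO_EXREG_CR3));
}
```

With the deny-list, a register such as CR3 stays marked available simply
because nobody listed it; with the allow-list, it is unavailable unless
someone adds it on purpose.
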
>
> Paolo
>
>>> btw, how did EXIT_INFO_1/EXIT_INFO_2 become needed? I cannot find any TDX code that uses them.
>>
>> vmx_get_exit_qual() / vmx_get_intr_info() are now used by TDX.
>>
>>>
>>> [1] https://lore.kernel.org/all/Z2GiQS_RmYeHU09L@xxxxxxxxxx/
>>>
>>>>>
>>>>>> + trace_kvm_exit(vcpu, KVM_ISA_VMX);
>>>>>> +
>>>>>> + return EXIT_FASTPATH_NONE;
>>>>>> +}
>>>>>
>>>>
>>>
>>
>>
>