Re: [PATCH 4/7] KVM: TDX: restore host xsave state when exit from the guest TD

From: Adrian Hunter
Date: Mon Nov 25 2024 - 06:10:56 EST


On 22/11/24 07:49, Chao Gao wrote:
>> +static void tdx_restore_host_xsave_state(struct kvm_vcpu *vcpu)
>> +{
>> + struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm);
>> +
>> + if (static_cpu_has(X86_FEATURE_XSAVE) &&
>> + kvm_host.xcr0 != (kvm_tdx->xfam & kvm_caps.supported_xcr0))
>> + xsetbv(XCR_XFEATURE_ENABLED_MASK, kvm_host.xcr0);
>> + if (static_cpu_has(X86_FEATURE_XSAVES) &&
>> + /* PT can be exposed to TD guest regardless of KVM's XSS support */
>> + kvm_host.xss != (kvm_tdx->xfam &
>> + (kvm_caps.supported_xss | XFEATURE_MASK_PT |
>> + XFEATURE_MASK_CET_USER | XFEATURE_MASK_CET_KERNEL)))
>
> Should we drop CET/PT from this series? I think they are worth a new
> patch/series.

This is not really about CET/PT

What is happening here is that we are calculating the current
MSR_IA32_XSS value based on the TDX Module spec which says the
TDX Module sets MSR_IA32_XSS to the XSS bits from XFAM. The
TDX Module does that literally, from TDX Module source code:

#define XCR0_SUPERVISOR_BIT_MASK 0x0001FD00
and
ia32_wrmsr(IA32_XSS_MSR_ADDR, xfam & XCR0_SUPERVISOR_BIT_MASK);

For KVM, rather than:

kvm_tdx->xfam &
(kvm_caps.supported_xss | XFEATURE_MASK_PT |
XFEATURE_MASK_CET_USER | XFEATURE_MASK_CET_KERNEL)

it would be more direct to define the bits and enforce them
via tdx_get_supported_xfam() e.g.

/*
* Before returning from TDH.VP.ENTER, the TDX Module assigns:
* XCR0 to the TD’s user-mode feature bits of XFAM (bits 7:0, 9)
* IA32_XSS to the TD's supervisor-mode feature bits of XFAM (bits 8, 16:10)
*/
#define TDX_XFAM_XCR0_MASK (GENMASK(7, 0) | BIT(9))
#define TDX_XFAM_XSS_MASK (GENMASK(16, 10) | BIT(8))
#define TDX_XFAM_MASK (TDX_XFAM_XCR0_MASK | TDX_XFAM_XSS_MASK)

static u64 tdx_get_supported_xfam(const struct tdx_sys_info_td_conf *td_conf)
{
u64 val = kvm_caps.supported_xcr0 | kvm_caps.supported_xss;

/* Ensure features are in the masks */
val &= TDX_XFAM_MASK;

if ((val & td_conf->xfam_fixed1) != td_conf->xfam_fixed1)
return 0;

val &= td_conf->xfam_fixed0;

return val;
}

and then:

if (static_cpu_has(X86_FEATURE_XSAVE) &&
kvm_host.xcr0 != (kvm_tdx->xfam & TDX_XFAM_XCR0_MASK))
xsetbv(XCR_XFEATURE_ENABLED_MASK, kvm_host.xcr0);
if (static_cpu_has(X86_FEATURE_XSAVES) &&
kvm_host.xss != (kvm_tdx->xfam & TDX_XFAM_XSS_MASK))
wrmsrl(MSR_IA32_XSS, kvm_host.xss);

>
>> + wrmsrl(MSR_IA32_XSS, kvm_host.xss);
>> + if (static_cpu_has(X86_FEATURE_PKU) &&
>
> How about using cpu_feature_enabled()? It is used in kvm_load_host_xsave_state()
> It handles the case where CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS is not
> enabled.
>
>> + (kvm_tdx->xfam & XFEATURE_MASK_PKRU))
>> + write_pkru(vcpu->arch.host_pkru);
>
> If host_pkru happens to match the hardware value after TD-exits, the write can
> be omitted, similar to what is done above for xss and xcr0.

True. It might be better to make restoring PKRU a separate
patch so that the commit message can explain why it needs to
be done here.