Re: [PATCH v19 079/130] KVM: TDX: vcpu_run: save/restore host state(host kernel gs)

From: Isaku Yamahata
Date: Fri Apr 12 2024 - 16:17:12 EST


On Sun, Apr 07, 2024 at 11:02:52AM +0800,
Binbin Wu <binbin.wu@xxxxxxxxxxxxxxx> wrote:

> > diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
> > index d72651ce99ac..8275a242ce07 100644
> > --- a/arch/x86/kvm/vmx/main.c
> > +++ b/arch/x86/kvm/vmx/main.c
> > @@ -158,6 +158,32 @@ static void vt_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
> > vmx_vcpu_reset(vcpu, init_event);
> > }
> > +static void vt_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
> > +{
> > + /*
> > + * All host state is saved/restored across SEAMCALL/SEAMRET,
>
> It sounds confusing to me.
> If all host states are saved/restored across SEAMCALL/SEAMRET, why this
> patch saves/restores MSR_KERNEL_GS_BASE for host?
>

No. Probably we should update the comment. Something like
restored => restored or initialized to reset state.

Except conditionally saved/restored MSRs (e.g., perfrmon, debugreg), IA32_START,
IA32_LSTART, MSR_SYSCALL_MASK, IA32_TSC_AUX and TA32_KERNEL_GS_BASE are reset to
initial state. uret handles the first four. The kernel_gs_base needs to be
restored on TDExit.

> > and the
> > + * guest state of a TD is obviously off limits. Deferring MSRs and DRs
> > + * is pointless because the TDX module needs to load *something* so as
> > + * not to expose guest state.
> > + */
> > + if (is_td_vcpu(vcpu)) {
> > + tdx_prepare_switch_to_guest(vcpu);
> > + return;
> > + }
> > +
> > + vmx_prepare_switch_to_guest(vcpu);
> > +}
> > +
> > +static void vt_vcpu_put(struct kvm_vcpu *vcpu)
> > +{
> > + if (is_td_vcpu(vcpu)) {
> > + tdx_vcpu_put(vcpu);
> > + return;
> > + }
> > +
> > + vmx_vcpu_put(vcpu);
> > +}
> > +
> > static int vt_vcpu_pre_run(struct kvm_vcpu *vcpu)
> > {
> > if (is_td_vcpu(vcpu))
> > @@ -326,9 +352,9 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
> > .vcpu_free = vt_vcpu_free,
> > .vcpu_reset = vt_vcpu_reset,
> > - .prepare_switch_to_guest = vmx_prepare_switch_to_guest,
> > + .prepare_switch_to_guest = vt_prepare_switch_to_guest,
> > .vcpu_load = vmx_vcpu_load,
> > - .vcpu_put = vmx_vcpu_put,
> > + .vcpu_put = vt_vcpu_put,
> > .update_exception_bitmap = vmx_update_exception_bitmap,
> > .get_msr_feature = vmx_get_msr_feature,
> > diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
> > index fdf9196cb592..9616b1aab6ce 100644
> > --- a/arch/x86/kvm/vmx/tdx.c
> > +++ b/arch/x86/kvm/vmx/tdx.c
> > @@ -1,5 +1,6 @@
> > // SPDX-License-Identifier: GPL-2.0
> > #include <linux/cpu.h>
> > +#include <linux/mmu_context.h>
> > #include <asm/tdx.h>
> > @@ -423,6 +424,7 @@ u8 tdx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
> > int tdx_vcpu_create(struct kvm_vcpu *vcpu)
> > {
> > struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm);
> > + struct vcpu_tdx *tdx = to_tdx(vcpu);
> > WARN_ON_ONCE(vcpu->arch.cpuid_entries);
> > WARN_ON_ONCE(vcpu->arch.cpuid_nent);
> > @@ -446,9 +448,47 @@ int tdx_vcpu_create(struct kvm_vcpu *vcpu)
> > if ((kvm_tdx->xfam & XFEATURE_MASK_XTILE) == XFEATURE_MASK_XTILE)
> > vcpu->arch.xfd_no_write_intercept = true;
> > + tdx->host_state_need_save = true;
> > + tdx->host_state_need_restore = false;
> > +
> > return 0;
> > }
> > +void tdx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
>
> Just like vmx_prepare_switch_to_host(), the input can be "struct vcpu_tdx
> *", since vcpu is not used inside the function.
> And the callsites just use "to_tdx(vcpu)"
>
> > +{
> > + struct vcpu_tdx *tdx = to_tdx(vcpu);
> Then, this can be dropped.

prepare_switch_to_guest() is used for kvm_x86_ops.prepare_switch_to_guest().
kvm_x86_ops consistently takes struct kvm_vcpu.
--
Isaku Yamahata <isaku.yamahata@xxxxxxxxx>