Re: [PATCH 35/43] KVM: x86: Move setting of sregs during vCPU RESET/INIT to common x86

From: Reiji Watanabe
Date: Fri May 21 2021 - 01:20:13 EST


On Tue, May 18, 2021 at 2:45 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Mon, May 17, 2021, Reiji Watanabe wrote:
> > > --- a/arch/x86/kvm/svm/svm.c
> > > +++ b/arch/x86/kvm/svm/svm.c
> > > @@ -1204,12 +1204,6 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
> > > init_sys_seg(&save->ldtr, SEG_TYPE_LDT);
> > > init_sys_seg(&save->tr, SEG_TYPE_BUSY_TSS16);
> > >
> > > - svm_set_cr0(vcpu, X86_CR0_NW | X86_CR0_CD | X86_CR0_ET);
> > > - svm_set_cr4(vcpu, 0);
> > > - svm_set_efer(vcpu, 0);
> > > - kvm_set_rflags(vcpu, X86_EFLAGS_FIXED);
> > > - vcpu->arch.regs[VCPU_REGS_RIP] = 0x0000fff0;
> >
> > Reviewed-by: Reiji Watanabe <reijiw@xxxxxxxxxx>
> >
> > Those your vCPU RESET/INIT changes look great.
> >
> > I think the change in init_vmcb() basically assumes that the
> > function is called from kvm_vcpu_reset(via svm_vcpu_reset()).
> > Although shutdown_interception() directly calls init_mcb(),
> > I would think the change doesn't matter for the shutdown
> > interception case.
> >
> > IMHO it would be a bit misleading that a function named 'init_vmcb',
> > which is called from other than kvm_vcpu_reset (svm_vcpu_reset()),
> > only partially resets the vmcb (probably just to me though).
>
> It's not just you, that code is funky. If I could go back in time, I would lobby
> to not automatically init the VMCB and instead put the vCPU into
> KVM_MP_STATE_UNINITIALIZED and force userspace to explicitly INIT or RESET the
> vCPU. Even better would be to add KVM_MP_STATE_SHUTDOWN, since technically NMI
> can break shutdown (and SMI on Intel CPUs).

I see. Adding KVM_MP_STATE_SHUTDOWN sounds right to me
given the real CPU's behavior.

> Anyways, that ship has sailed, but we might be able to get away with replacing
> init_vmcb() with kvm_vcpu_reset(vcpu, true), i.e. effecting a full INIT. That
> would ensure the VMCB is consistent with KVM's software model, which I'm guessing
> is not true with the direct init_vmcb() call. It would also have some connection
> to reality since there exist bare metal platforms that automatically INIT the CPU
> if it hits shutdown (maybe only for the BSP?).

Yes, calling kvm_vcpu_reset(vcpu, true) sounds better than
the direct init_vmcb() call.


> Side topic, the NMI thing got me looking through init_vmcb() to see how it
> handles the IDT and GDT, and surprise, surprise, it fails to zero IDTR.base and
> GDTR.base. I'll add a patch to fix that, and maybe try to consolidate the VMX
> and SVM segmentation logic.

That's surprising...
It seems init_vmcb() was used only for RESET when the function was
originally introduced and the entire vmcb was zero-cleared before
init_vmcb() was called. So, I would suspect init_vmcb() was originally
implemented assuming that all the vmcb fields were zero.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6aa8b732ca01c3d7a54e93f4d701b8aabbe60fb7

Thanks,
Reiji