Re: [PATCH 4/4] KVM: x86: forcibly leave nested mode on vCPU reset

From: Maxim Levitsky
Date: Tue Oct 25 2022 - 09:37:53 EST


On Thu, 2022-10-20 at 15:33 +0000, Sean Christopherson wrote:
> On Thu, Oct 20, 2022, Maxim Levitsky wrote:
> > While not obvious, kvm_vcpu_reset leaves the nested mode by
>
> Please add () when referencing function, and wrap closer to ~75 chars.
>
> > clearing 'vcpu->arch.hflags' but it does so without all the
> > required housekeeping.
> >
> > This makes SVM and VMX continue to use vmcs02/vmcb02 while
>
> This bug should be impossible to hit on VMX as INIT and TRIPLE_FAULT unconditionally
> cause VM-Exit, i.e. will always be forwarded to L1.

True, I guess. As I also found out, on VMX the physical CPU can't be reset while
in guest mode. I'll update the changelog.

>
> > the cpu is not in nested mode.
>
> Can you add a blurb to call out exactly how this bug can be triggered?  Doesn't
> take much effort to suss out the "how", but it'd be nice to capture that info in
> the changelog.

I will add (in another patch) a selftest for this.

>
> > In particular, in SVM code, it makes the 'svm_free_nested'
> > free the vmcb02, while still in use, which later triggers
> > use after free and a kernel crash.
> >
> > This issue is assigned CVE-2022-3344
> >
> > Cc: stable@xxxxxxxxxxxxxxx
> > Signed-off-by: Maxim Levitsky <mlevitsk@xxxxxxxxxx>
> > ---
> >  arch/x86/kvm/x86.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index d86a8aae1471d3..313c4a6dc65e45 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -11931,6 +11931,7 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
> >         WARN_ON_ONCE(!init_event &&
> >                      (old_cr0 || kvm_read_cr3(vcpu) || kvm_read_cr4(vcpu)));
> >  
> > +       kvm_leave_nested(vcpu);
>
> Not a big deal, especially if/when nested_ops are turned into static_calls, but
> at the same time it's quite easy to do:
>
>         if (is_guest_mode(vcpu))
>                 kvm_leave_nested(vcpu);
>
> I think it's worth adding a comment explaining how this can happen, and to also
> call out that EFER is cleared on INIT, i.e. that virtualization is disabled due
> to EFER.SVME=0.  Unsurprisingly, I don't see anything in the APM that explicitly
> states what happens if INIT occurs in guest mode, i.e. it's not immediately obvious
> that forcing the vCPU back to L1 is architecturally correct.
>
>
> >         kvm_lapic_reset(vcpu, init_event);
> >  
> >         vcpu->arch.hflags = 0;
>
> Maybe add a WARN above this to try and detect other potential issues?  Kinda silly,
> but it'd at least help draw attention to the importance of hflags.
>
> E.g. this?
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 4bd5f8a751de..c50fa0751a0b 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -11915,6 +11915,15 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
>         unsigned long old_cr0 = kvm_read_cr0(vcpu);
>         unsigned long new_cr0;
>  
> +       /*
> +        * SVM doesn't unconditionally VM-Exit on INIT and SHUTDOWN, thus it's
> +        * possible to INIT the vCPU while L2 is active.  Force the vCPU back
> +        * into L1 as EFER.SVME is cleared on INIT (along with all other EFER
> +        * bits), i.e. virtualization is disabled.
> +        */
> +       if (is_guest_mode(vcpu))
> +               kvm_leave_nested(vcpu);
> +
>         /*
>          * Several of the "set" flows, e.g. ->set_cr0(), read other registers
>          * to handle side effects.  RESET emulation hits those flows and relies
> @@ -11927,6 +11936,7 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
>  
>         kvm_lapic_reset(vcpu, init_event);
>  
> +       WARN_ON_ONCE(is_guest_mode(vcpu) || is_smm(vcpu));
>         vcpu->arch.hflags = 0;
>  
>         vcpu->arch.smi_pending = 0;
>
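
For anyone following along, here is a minimal user-space sketch of the failure
mode discussed above. It is a toy model, not KVM code: every toy_* name and
TOY_HF_GUEST_MASK are hypothetical stand-ins, and the assumption that reset
clears hflags and then frees the nested state is a simplification of the
sequence described in the changelog.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the "vCPU is running L2" bit in hflags. */
#define TOY_HF_GUEST_MASK (1u << 0)

/* Toy vCPU: two control blocks and a pointer to the one in use. */
struct toy_vcpu {
	unsigned int hflags;
	int vmcb01;          /* models L1's control block */
	int vmcb02;          /* models L2's control block */
	int *current_vmcb;   /* block the next "VMRUN" would use */
	bool vmcb02_valid;   /* false once the nested state is freed */
};

static void toy_init(struct toy_vcpu *v)
{
	v->hflags = 0;
	v->vmcb01 = 0;
	v->vmcb02 = 0;
	v->current_vmcb = &v->vmcb01;
	v->vmcb02_valid = false;
}

static bool toy_is_guest_mode(const struct toy_vcpu *v)
{
	return (v->hflags & TOY_HF_GUEST_MASK) != 0;
}

/* Enter L2: allocate the nested block and switch to it. */
static void toy_enter_nested(struct toy_vcpu *v)
{
	v->vmcb02_valid = true;
	v->current_vmcb = &v->vmcb02;
	v->hflags |= TOY_HF_GUEST_MASK;
}

/* The "housekeeping": switch back to L1's block before clearing the flag. */
static void toy_leave_nested(struct toy_vcpu *v)
{
	assert(toy_is_guest_mode(v));
	v->current_vmcb = &v->vmcb01;
	v->hflags &= ~TOY_HF_GUEST_MASK;
}

/* Models freeing the nested state once virtualization is disabled. */
static void toy_free_nested(struct toy_vcpu *v)
{
	v->vmcb02_valid = false;
}

/*
 * Toy reset.  Without leave_nested_first, hflags is zeroed while
 * current_vmcb still points at vmcb02, which is then freed.
 */
static void toy_reset(struct toy_vcpu *v, bool leave_nested_first)
{
	if (leave_nested_first && toy_is_guest_mode(v))
		toy_leave_nested(v);

	v->hflags = 0;
	toy_free_nested(v);
}

/* True if the vCPU would run with a block that has already been freed. */
static bool toy_uses_stale_vmcb(const struct toy_vcpu *v)
{
	return v->current_vmcb == &v->vmcb02 && !v->vmcb02_valid;
}
```

In this model, calling toy_reset(&v, false) after toy_enter_nested(&v) leaves
toy_uses_stale_vmcb(&v) true, whereas toy_reset(&v, true) leaves it false,
which is the property the is_guest_mode() check is meant to guarantee.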

Best regards,
Maxim Levitsky