Re: [PATCH 4/4] KVM: x86: forcibly leave nested mode on vCPU reset
From: Maxim Levitsky
Date: Tue Oct 25 2022 - 09:37:53 EST
On Thu, 2022-10-20 at 15:33 +0000, Sean Christopherson wrote:
> On Thu, Oct 20, 2022, Maxim Levitsky wrote:
> > While not obivous, kvm_vcpu_reset leaves the nested mode by
>
> Please add () when referencing function, and wrap closer to ~75 chars.
>
> > clearing 'vcpu->arch.hflags' but it does so without all the
> > required housekeeping.
> >
> > This makes SVM and VMX continue to use vmcs02/vmcb02 while
>
> This bug should be impossible to hit on VMX as INIT and TRIPLE_FAULT unconditionally
> cause VM-Exit, i.e. will always be forwarded to L1.
True I guess as I found out as well, in VMX the physical CPU can't be reset while
in guest mode. I'll update the changelog.
>
> > the cpu is not in nested mode.
>
> Can you add a blurb to call out exactly how this bug can be triggered? Doesn't
> take much effort to suss out the "how", but it'd be nice to capture that info in
> the changelog.
I will add (in another patch) a selftest for this.
>
> > In particular, in SVM code, it makes the 'svm_free_nested'
> > free the vmcb02, while still in use, which later triggers
> > use after free and a kernel crash.
> >
> > This issue is assigned CVE-2022-3344
> >
> > Cc: stable@xxxxxxxxxxxxxxx
> > Signed-off-by: Maxim Levitsky <mlevitsk@xxxxxxxxxx>
> > ---
> > arch/x86/kvm/x86.c | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index d86a8aae1471d3..313c4a6dc65e45 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -11931,6 +11931,7 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
> > WARN_ON_ONCE(!init_event &&
> > (old_cr0 || kvm_read_cr3(vcpu) || kvm_read_cr4(vcpu)));
> >
> > + kvm_leave_nested(vcpu);
>
> Not a big deal, especially if/when nested_ops are turned into static_calls, but
> at the same time it's quite easy to do:
>
> if (is_guest_mode(vcpu))
> kvm_leave_nested(vcpu);
>
> I think it's worth adding a comment explaining how this can happen, and to also
> call out that EFER is cleared on INIT, i.e. that virtualization is disabled due
> to EFER.SVME=0. Unsurprisingly, I don't see anything in the APM that explicitly
> states what happens if INIT occurs in guest mode, i.e. it's not immediately obvious
> that forcing the vCPU back to L1 is architecturally correct.
>
>
> > kvm_lapic_reset(vcpu, init_event);
> >
> > vcpu->arch.hflags = 0;
>
> Maybe add a WARN above this to try and detect other potential issues? Kinda silly,
> but it'd at least help draw attention to the importance of hflags.
>
> E.g. this?
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 4bd5f8a751de..c50fa0751a0b 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -11915,6 +11915,15 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
> unsigned long old_cr0 = kvm_read_cr0(vcpu);
> unsigned long new_cr0;
>
> + /*
> + * SVM doesn't unconditionally VM-Exit on INIT and SHUTDOWN, thus it's
> + * possible to INIT the vCPU while L2 is active. Force the vCPU back
> + * into L1 as EFER.SVME is cleared on INIT (along with all other EFER
> + * bits), i.e. virtualization is disabled.
> + */
> + if (is_guest_mode(vcpu))
> + kvm_leave_nested(vcpu);
> +
> /*
> * Several of the "set" flows, e.g. ->set_cr0(), read other registers
> * to handle side effects. RESET emulation hits those flows and relies
> @@ -11927,6 +11936,7 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
>
> kvm_lapic_reset(vcpu, init_event);
>
> + WARN_ON_ONCE(is_guest_mode(vcpu) || is_smm(vcpu));
> vcpu->arch.hflags = 0;
>
> vcpu->arch.smi_pending = 0;
>
Best regards,
Maxim Levitsky