Re: [PATCH v2 1/2] KVM: SVM: Triple fault L1 on unintercepted EFER.SVME clear by L2

From: Yosry Ahmed

Date: Fri Feb 27 2026 - 15:04:10 EST


> > > @@ -216,6 +216,17 @@ int svm_set_efer(struct kvm_vcpu *vcpu, u64 efer)
> > >
> > > if ((old_efer & EFER_SVME) != (efer & EFER_SVME)) {
> > > if (!(efer & EFER_SVME)) {
> > > + /*
> > > + * Architecturally, clearing EFER.SVME while a guest is
> > > + * running yields undefined behavior, i.e. KVM can do
> > > + * literally anything. Force the vCPU back into L1 as
> > > + * that is the safest option for KVM, but synthesize a
> > > + * triple fault (for L1!) so that KVM at least doesn't
> > > + * run random L2 code in the context of L1.
> > > + */
> > > + if (is_guest_mode(vcpu))
> > > + kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
> > > +
> >
> > Sigh, I think this is not correct in all cases:
> >
> > 1. If userspace restores a vCPU with EFER.SVME=0 to a vCPU with
> > EFER.SVME=1 (e.g. restoring a vCPU running L1 to a vCPU running L2).
> > Typically KVM_SET_SREGS is done before KVM_SET_NESTED_STATE, so we may
> > set EFER.SVME = 0 before leaving guest mode.
> >
> > 2. On vCPU reset, we clear EFER. Hmm, this one is seemingly okay tho,
> > looking at kvm_vcpu_reset(), we leave nested first:
> >
> > /*
> > * SVM doesn't unconditionally VM-Exit on INIT and SHUTDOWN, thus it's
> > * possible to INIT the vCPU while L2 is active. Force the vCPU back
> > * into L1 as EFER.SVME is cleared on INIT (along with all other EFER
> > * bits), i.e. virtualization is disabled.
> > */
> > if (is_guest_mode(vcpu))
> > kvm_leave_nested(vcpu);
> >
> > ...
> >
> > kvm_x86_call(set_efer)(vcpu, 0);
> >
> > So I think the only problematic case is (1). We can probably fix this by
> > plumbing host_initiated through set_efer? This is getting more
> > complicated than I would have liked..
>
> What if we instead hook WRMSR interception? A little fugly (well, more than a
> little), but I think it would minimize the chances of a false-positive. The
> biggest potential flaw I see is that this will incorrectly triple fault if KVM
> synthesizes a #VMEXIT while emulating the WRMSR. But that really shouldn't
> happen, because even a #GP=>#VMEXIT needs to be queued but not synthesized until
> the emulation sequence completes (any other behavior would risk confusing KVM).

What if we key off vcpu->wants_to_run?

It gives less protection against false positives from things like
kvm_vcpu_reset() (if it didn't leave nested before clearing EFER), but
more protection against the #VMEXIT case you mentioned. It should also
be much lower on the fugliness scale imo.
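
To make the idea concrete, here is a rough standalone model of the check, not
kernel code and not a proposed diff. It assumes vcpu->wants_to_run is only true
while the vCPU is inside KVM_RUN, so a userspace restore via KVM_SET_SREGS
would see it as false. struct model_vcpu, model_set_efer() and the
triple_fault_req flag are hypothetical stand-ins for struct kvm_vcpu,
svm_set_efer() and a pending KVM_REQ_TRIPLE_FAULT:

```c
/*
 * Standalone model for illustration only. None of these types or helpers
 * exist in KVM; they are simplified stand-ins for the real ones.
 */
#include <stdbool.h>
#include <stdint.h>

#define EFER_SVME (1ULL << 12)	/* EFER.SVME is bit 12 */

struct model_vcpu {
	uint64_t efer;
	bool guest_mode;	/* stand-in for is_guest_mode(vcpu) */
	bool wants_to_run;	/* stand-in for vcpu->wants_to_run */
	bool triple_fault_req;	/* stand-in for a pending KVM_REQ_TRIPLE_FAULT */
};

static void model_set_efer(struct model_vcpu *vcpu, uint64_t efer)
{
	uint64_t old_efer = vcpu->efer;

	if ((old_efer & EFER_SVME) != (efer & EFER_SVME)) {
		if (!(efer & EFER_SVME)) {
			/*
			 * Only treat the clear as guest-initiated if the vCPU
			 * is actually being run. A userspace restore that
			 * clears SVME before leaving guest mode is skipped.
			 */
			if (vcpu->guest_mode && vcpu->wants_to_run)
				vcpu->triple_fault_req = true;

			/* Assumption: the vCPU is forced back into L1 either way. */
			vcpu->guest_mode = false;
		}
	}

	vcpu->efer = efer;
}
```

In this model, case (1) above (state restore with wants_to_run == false) no
longer triggers the triple fault, while an L2 write to EFER during KVM_RUN
still does.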