Re: [PATCH 3/3] KVM: x86: Check for injected exceptions before queuing a debug exception

From: Sean Christopherson

Date: Mon Mar 02 2026 - 18:50:37 EST


On Mon, Mar 02, 2026, Yosry Ahmed wrote:
> On Mon, Mar 2, 2026 at 3:22 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> >
> > On Fri, Feb 27, 2026, Yosry Ahmed wrote:
> > > > > That being said, I hate nested_run_in_progress. It's too close to
> > > > > nested_run_pending and I am pretty sure they will be mixed up.
> > > >
> > > > Agreed, though the fact that name is _too_ close means that, aside from the
> > > > potential for disaster (minor detail), it's accurate.
> > > >
> > > > One thought is to hide nested_run_in_progress beyond a KConfig, so that attempts
> > > > to use it for anything but the sanity check(s) would fail the build. I don't
> > > > really want to create yet another KVM_PROVE_xxx though, but unlike KVM_PROVE_MMU,
> > > > I think we want to this enabled in production.
> > > >
> > > > I'll chew on this a bit...
> > >
> > > Maybe (if we go this direction) name it very explicitly
> > > warn_on_nested_exception if it's only intended to be used for the
> > > sanity checks?
> >
> > It's not just about exceptions though. That's the case that has caused a rash
> > of recent problems, but the rule isn't specific to exceptions, it's very broadly
> > Thou Shalt Not Cancel VMRUN.
> >
> > I think that's where there's some disconnect. We can't make the nested_run_pending
> > warnings go away by adding more sanity checks, and I am dead set against removing
> > those warnings.
> >
> > Aha! Idea. What if we turn nested_run_pending into a u8, and use a magic value
> > of '2' to indicate that userspace gained control of the CPU since nested_run_pending
> > was set, and then only WARN on nested_run_pending==1? That way we don't have to
> > come up with a new name, and there's zero chance of nested_run_pending and something
> > like nested_run_in_progress getting out of sync.
>
> Yeah this should work, the only thing I would change is using macros
> instead of 1 and 2 for readability.

I was "this" close to using a enum or #define, but I couldn't figure out a clean
solution to this code:

vcpu->arch.nested_run_pending =
!!(kvm_state->flags & KVM_STATE_NESTED_RUN_PENDING);

as I didn't want to end up with effectively:

if (true)
x = 1;
else
x = 0;

But thinking more on it, that code is inherently untrusuted, so it can be this:

if (kvm_state->flags & KVM_STATE_NESTED_RUN_PENDING)
vcpu->arch.nested_run_pending = KVM_NESTED_RUN_PENDING_UNTRUSTED;
else
vcpu->arch.nested_run_pending = 0;

which is pretty much the same, but at least is a bit more than a convoluted cast
from a bool to an int.

FWIW, I verified this makes the C reproducer happy.