Re: [PATCH 2/3] KVM: x86: guest debug: don't inject interrupts while single stepping

From: Sean Christopherson
Date: Tue Mar 16 2021 - 12:51:36 EST


On Tue, Mar 16, 2021, Maxim Levitsky wrote:
> On Tue, 2021-03-16 at 16:31 +0100, Jan Kiszka wrote:
> > Back then, when I was hacking on the gdb-stub and KVM support, the
> > monitor trap flag was not yet broadly available, but the idea to once
> > use it was already there. Now it can be considered broadly available,
> > but it would still require some changes to get it in.
> >
> > Unfortunately, we don't have such thing with SVM, even recent versions,
> > right? So, a proper way of avoiding diverting event injections while we
> > are having the guest in an "incorrect" state should definitely be the goal.
> >
> Yes, I am not aware of anything like monitor trap on SVM.
>
> >
> > Given that KVM knows whether TF originates solely from guest debugging
> > or was (also) injected by the guest, we should be able to identify the
> > cases where your approach is best to apply. And that without any extra
> > control knob that everyone will only forget to set.
> Well I think that the downside of this patch is that the user might actually
> want to single step into an interrupt handler, and this patch makes it a bit
> more complicated, and changes the default behavior.

Yes. And, as is, this also blocks NMIs and SMIs. I suspect it also doesn't
prevent weirdness if the guest is running in L2, since IRQs for L1 will cause
exits from L2 during nested_ops->check_events().

> I have no objections though to use this patch as is, or at least make this
> the new default with a new flag to override this.

That's less bad, but IMO it still violates the principle of least surprise, e.g.
someone who is single-stepping a guest and expecting an IRQ to fire will be
all kinds of confused if they see all the proper IRR, ISR, EFLAGS.IF, etc.
settings, but no interrupt.

> Sean Christopherson, what do you think?

Rather than block all events in KVM, what about having QEMU "pause" the timer?
E.g. save MSR_TSC_DEADLINE and APIC_TMICT (or inspect the guest to find out
which flavor it's using), clear them to zero, then restore both when
single-stepping is disabled. I think that will work?
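
Roughly, the save/clear/restore bookkeeping could look like the sketch below.
This is not actual QEMU or KVM code: struct timer_state, struct timer_pause and
both helpers are hypothetical names made up for illustration, and the sketch
assumes userspace has already read the two timer values from the vCPU and will
write them back itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the guest's local APIC timer registers. */
struct timer_state {
    uint64_t tsc_deadline;  /* TSC-deadline MSR value, 0 == disarmed */
    uint32_t tmict;         /* APIC initial-count register, 0 == stopped */
};

/* Hypothetical per-vCPU bookkeeping kept by the debugger stub. */
struct timer_pause {
    struct timer_state saved;
    bool paused;
};

/* Called when single-stepping is enabled: stash both values, then zero them. */
static void timer_pause_for_singlestep(struct timer_pause *p,
                                       struct timer_state *live)
{
    if (p->paused)
        return;             /* already paused, don't clobber the saved copy */

    p->saved = *live;
    live->tsc_deadline = 0;
    live->tmict = 0;
    p->paused = true;
}

/* Called when single-stepping is disabled: put back whatever was saved. */
static void timer_resume_after_singlestep(struct timer_pause *p,
                                          struct timer_state *live)
{
    if (!p->paused)
        return;

    *live = p->saved;
    p->paused = false;
}
```

Saving both values sidesteps having to figure out which flavor the guest is
using. One caveat I haven't thought through: I think writing TMICT back restarts
the countdown from the full initial count, and a TSC deadline that expired while
paused would fire as soon as it is restored, so the guest may still see some
timer skew.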