Re: [PATCH v2 0/2] kvm: x86: Convey the exit reason to user-space on emulation failure

From: David Matlack
Date: Thu Jul 08 2021 - 14:38:27 EST


On Thu, Jul 08, 2021 at 03:17:40PM +0100, David Edmondson wrote:
> Apologies if you see two of these - I had some email problems earlier.

I only got one! :)

>
> On Wednesday, 2021-07-07 at 23:20:04 UTC, David Matlack wrote:
>
> > On Tue, Jul 06, 2021 at 11:12:05AM +0100, David Edmondson wrote:
> >> To help when debugging failures in the field, if instruction emulation
> >> fails, report the VM exit reason to userspace in order that it can be
> >> recorded.
> >
> > What is the benefit of seeing the VM-exit reason that led to an
> > emulation failure?
>
> I can't cite an example of where this has definitively led in a
> direction that helped solve a problem, but we do sometimes see emulation
> failures reported in situations where we are not able to reproduce the
> failures on demand and the existing information provided at the time of
> failure is either insufficient or suspect.
>
> Given that, I'm left casting about for data that can be made available
> to assist in postmortem analysis of the failures.

Understood, thanks for the context. My only concern would be that
userspace APIs are difficult to change once they exist. If it turns
out that knowing the exit reason does not help with debugging emulation
failures, we'd still be stuck with exporting it on every emulation
failure.

My intuition is that the instruction bytes (which are now available with
Aaron's patch) and the guest register state (which is queryable through
other ioctls) should be sufficient to set up a reproduction of the
emulation failure in a kvm-unit-test and the exit reason should not
really matter. I'm curious if that's not the case?
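
A minimal sketch of the userspace side of that idea, assuming a VMM saves
the instruction bytes and RIP at failure time so they can be pasted into a
reproduction later. The struct and helper names below are hypothetical and
are not part of the KVM API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical record, not a KVM structure: what a VMM might save. */
struct emu_failure_record {
	uint64_t rip;            /* guest RIP, e.g. as read via KVM_GET_REGS */
	uint8_t  insn_size;      /* number of valid bytes in insn_bytes */
	uint8_t  insn_bytes[15]; /* an x86 instruction is at most 15 bytes */
};

/*
 * Format the saved instruction bytes as space-separated hex
 * (for example "0f 01 c1") so they can be logged.
 * Returns the string length, or -1 if the buffer is too small.
 */
static int format_insn_bytes(const struct emu_failure_record *rec,
			     char *out, size_t outlen)
{
	size_t n, pos = 0;

	assert(rec && out);
	n = rec->insn_size;
	if (n > sizeof(rec->insn_bytes))
		n = sizeof(rec->insn_bytes);

	/* Each byte takes two hex digits plus a separator or the final NUL. */
	if (outlen < (n ? 3 * n : 1))
		return -1;

	out[0] = '\0';
	for (size_t i = 0; i < n; i++)
		pos += (size_t)snprintf(out + pos, outlen - pos, "%s%02x",
					i ? " " : "",
					(unsigned int)rec->insn_bytes[i]);
	return (int)pos;
}
```

The logged string plus the saved register values would then be the input
for a hand-written test case; nothing here depends on the exit reason.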

I'm really not opposed to exporting the exit reason if it is useful, I'm
just not sure it will help.

>
> >> I'm unsure whether sgx_handle_emulation_failure() needs to be adapted
> >> to use the emulation_failure part of the exit union in struct kvm_run
> >> - advice welcomed.
> >>
> >> v2:
> >> - Improve patch comments (dmatlack)
> >> - Intel should provide the full exit reason (dmatlack)
> >> - Pass a boolean rather than flags (dmatlack)
> >> - Use the helper in kvm_task_switch() and kvm_handle_memory_failure()
> >> (dmatlack)
> >> - Describe the exit_reason field of the emulation_failure structure
> >> (dmatlack)
> >>
> >> David Edmondson (2):
> >>   KVM: x86: Add kvm_x86_ops.get_exit_reason
> >>   KVM: x86: On emulation failure, convey the exit reason to userspace
> >>
> >>  arch/x86/include/asm/kvm-x86-ops.h |  1 +
> >>  arch/x86/include/asm/kvm_host.h    |  3 +++
> >>  arch/x86/kvm/svm/svm.c             |  6 ++++++
> >>  arch/x86/kvm/vmx/vmx.c             | 11 +++++++----
> >>  arch/x86/kvm/x86.c                 | 22 +++++++++++++---------
> >>  include/uapi/linux/kvm.h           |  7 +++++++
> >>  6 files changed, 37 insertions(+), 13 deletions(-)
> >>
> >> --
> >> 2.30.2
> >>
>
> dme.
> --
> It's gettin', it's gettin', it's gettin' kinda hectic.