Re: [PATCH v19 106/130] KVM: TDX: Add KVM Exit for TDX TDG.VP.VMCALL

From: Isaku Yamahata
Date: Thu Apr 04 2024 - 19:02:57 EST


On Wed, Apr 03, 2024 at 08:58:50AM -0700,
Sean Christopherson <seanjc@xxxxxxxxxx> wrote:

> On Mon, Feb 26, 2024, isaku.yamahata@xxxxxxxxx wrote:
> > From: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
> >
> > Some of TDG.VP.VMCALL require device model, for example, qemu, to handle
> > them on behalf of kvm kernel module. TDVMCALL_REPORT_FATAL_ERROR,
> > TDVMCALL_MAP_GPA, TDVMCALL_SETUP_EVENT_NOTIFY_INTERRUPT, and
> > TDVMCALL_GET_QUOTE requires user space VMM handling.
> >
> > Introduce new kvm exit, KVM_EXIT_TDX, and functions to setup it. Device
> > model should update R10 if necessary as return value.
>
> Hard NAK.
>
> KVM needs its own ABI, under no circumstance should KVM inherit ABI directly from
> the GHCI. Even worse, this doesn't even sanity check the "unknown" VMCALLs, KVM
> just blindly punts *everything* to userspace. And even worse than that, KVM
> already has at least one user exit that overlaps, TDVMCALL_MAP_GPA => KVM_HC_MAP_GPA_RANGE.
>
> If the userspace VMM wants to run an end-around on KVM and directly communicate
> with the guest, e.g. via a synthetic device (a la virtio), that's totally fine,
> because *KVM* is not definining any unique ABI, KVM is purely providing the
> transport, e.g. emulated MMIO or PIO (and maybe not even that). IIRC, this option
> even came up in the context of GET_QUOTE.
>
> But explicit exiting to userspace with KVM_EXIT_TDX is very different. KVM is
> creating a contract with userspace that says "for TDX VMCALLs [a-z], KVM will exit
> to userspace with values [a-z]". *Every* new VMCALL that's added to the GHCI will
> become KVM ABI, e.g. if Intel ships a TDX module that adds a new VMALL, then KVM
> will forward the exit to userspace, and userspace can then start relying on that
> behavior.
>
> And punting all register state, decoding, etc. to userspace creates a crap ABI.
> KVM effectively did this for SEV and SEV-ES by copying the PSP ABI verbatim into
> KVM ioctls(), and it's a gross, ugly mess.
>
> Each VMCALL that KVM wants to forward needs a dedicated KVM_EXIT_<reason> and
> associated struct in the exit union. Yes, it's slightly more work now, but it's
> one time pain. Whereas copying all registers is endless misery for everyone
> involved, e.g. *every* userspace VMM needs to decipher the registers, do sanity
> checking, etc. And *every* end user needs to do the same when a debugging
> inevitable failures.
>
> This also solves Chao's comment about XMM registers. Except for emualting Hyper-V
> hypercalls, which have very explicit handling, KVM does NOT support using XMM
> registers in hypercalls.

Sure. I will introduce the followings.

KVM_EXIT_TDX_GET_QUOTE
Request a quote.

KVM_EXIT_TDX_SETUP_EVENT_NOTIFY_INTERRUPT
Guest tells which interrupt vector the VMM uses to notify the guest.
The use case if GetQuote. It is async request. The user-space VMM uses
this interrupts to notify the guest on the completion. Or guest polls it.

KVM_EXIT_TDX_REPORT_FATAL_ERROR
Guest panicked. This conveys extra 64 bytes in registers. Probably this should
be converted to KVM_EXIT_SYSTEM_EVENT and KVM_SYSTEM_EVENT_CRASH.

MapGPA is converted to KVM_HC_MAP_GPA_RANGE.

--
Isaku Yamahata <isaku.yamahata@xxxxxxxxx>