Re: [PATCH] KVM: SVM: Do not terminate SEV-ES guests on GHCB validation failure

From: Sean Christopherson
Date: Thu May 20 2021 - 15:16:55 EST


On Mon, May 17, 2021, Tom Lendacky wrote:
> On 5/14/21 6:06 PM, Peter Gonda wrote:
> > On Fri, May 14, 2021 at 1:22 PM Tom Lendacky <thomas.lendacky@xxxxxxx> wrote:
> >>
> >> Currently, an SEV-ES guest is terminated if the validation of the VMGEXIT
> >> exit code and parameters fail. Since the VMGEXIT instruction can be issued
> >> from userspace, even though userspace (likely) can't update the GHCB,
> >> don't allow userspace to be able to kill the guest.
> >>
> >> Return a #GP request through the GHCB when validation fails, rather than
> >> terminating the guest.
> >
> > Is this a gap in the spec? I don't see anything that details what
> > should happen if the correct fields for NAE are not set in the first
> > couple paragraphs of section 4 'GHCB Protocol'.
>
> No, I don't think the spec needs to spell out everything like this. The
> hypervisor is free to determine its course of action in this case.

The hypervisor can decide whether to inject/return an error or kill the guest,
but what errors can be returned and how they're returned absolutely needs to be
ABI between guest and host, and to make the ABI vendor agnostic the GHCB spec
is the logical place to define said ABI.

For example, "injecting" #GP if the guest botched the GHCB on #VMGEXIT(CPUID) is
completely nonsensical. As is, a Linux guest appears to blindly forward the #GP,
which means if something does go awry KVM has just made debugging the guest that
much harder, e.g. imagine the confusion that will ensue if the end result is a
SIGBUS to userspace on CPUID.

There needs to be an explicit error code for "you gave me bad data", otherwise
we're signing ourselves up for future pain.

> I suppose the spec could suggest a course of action, but I don't think the
> spec should require a specific course of action.
>
> Thanks,
> Tom
>
> >