Re: [RFC PATCH 11/35] KVM: SVM: Prepare for SEV-ES exit handling in the sev.c file
From: Tom Lendacky
Date: Wed Sep 16 2020 - 16:43:16 EST
On 9/15/20 12:21 PM, Sean Christopherson wrote:
> On Mon, Sep 14, 2020 at 03:15:25PM -0500, Tom Lendacky wrote:
>> From: Tom Lendacky <thomas.lendacky@xxxxxxx>
>>
>> This is a pre-patch to consolidate some exit handling code into callable
>> functions. Follow-on patches for SEV-ES exit handling will then be able
>> to use them from the sev.c file.
>>
>> Signed-off-by: Tom Lendacky <thomas.lendacky@xxxxxxx>
>> ---
>> arch/x86/kvm/svm/svm.c | 64 +++++++++++++++++++++++++-----------------
>> 1 file changed, 38 insertions(+), 26 deletions(-)
>>
>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
>> index f9daa40b3cfc..6a4cc535ba77 100644
>> --- a/arch/x86/kvm/svm/svm.c
>> +++ b/arch/x86/kvm/svm/svm.c
>> @@ -3047,6 +3047,43 @@ static void dump_vmcb(struct kvm_vcpu *vcpu)
>> "excp_to:", save->last_excp_to);
>> }
>>
>> +static bool svm_is_supported_exit(struct kvm_vcpu *vcpu, u64 exit_code)
>> +{
>> + if (exit_code < ARRAY_SIZE(svm_exit_handlers) &&
>> + svm_exit_handlers[exit_code])
>> + return true;
>> +
>> + vcpu_unimpl(vcpu, "svm: unexpected exit reason 0x%llx\n", exit_code);
>> + dump_vmcb(vcpu);
>> + vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
>> + vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_UNEXPECTED_EXIT_REASON;
>> + vcpu->run->internal.ndata = 2;
>> + vcpu->run->internal.data[0] = exit_code;
>> + vcpu->run->internal.data[1] = vcpu->arch.last_vmentry_cpu;
>
> Based on the name "is_supported_exit", I would prefer that vcpu->run be filled
> in by the caller. Looking at the below code where svm_is_supported_exit() is
> checked, without diving into the implementation of the helper it's not at all
> clear that vcpu->run is filled.
>
> Assuming svm_invoke_exit_handler() is the only user, it probably makes sense to
> fill vcpu->run in the caller. If there will be multiple callers, then it'd be
> nice to rename svm_is_supported_exit() to e.g. svm_handle_invalid_exit() or so.
Will change.
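
For reference, a minimal user-space sketch of the "fill vcpu->run in the
caller" option. The types, constants and the single handler are hypothetical
stand-ins rather than the kernel definitions, and the names carry a _sketch
suffix to make that obvious. The predicate has no side effects, and the
error reporting sits next to the check so it is visible at the call site.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the values are illustrative, not KVM's. */
#define EXIT_INTERNAL_ERROR_SKETCH		1u
#define SUBERR_UNEXPECTED_EXIT_SKETCH		2u
#define EXIT_CODE_HLT_SKETCH			0x78u
#define ARRAY_SIZE_SKETCH(a)	(sizeof(a) / sizeof((a)[0]))

struct run_sketch {
	uint32_t exit_reason;
	uint32_t suberror;
	uint32_t ndata;
	uint64_t data[2];
};

struct vcpu_sketch {
	struct run_sketch run;
	uint32_t last_vmentry_cpu;
};

typedef int (*exit_handler_sketch_t)(struct vcpu_sketch *vcpu);

/* Stand-in handler: returns 1 to mean "resume the guest". */
static int halt_interception_sketch(struct vcpu_sketch *vcpu)
{
	(void)vcpu;
	return 1;
}

/* Sparse table: every slot without an initializer is NULL. */
static exit_handler_sketch_t exit_handlers_sketch[] = {
	[EXIT_CODE_HLT_SKETCH] = halt_interception_sketch,
};

/* Pure predicate: only answers the question, touches no state. */
static bool exit_is_supported_sketch(uint64_t exit_code)
{
	return exit_code < ARRAY_SIZE_SKETCH(exit_handlers_sketch) &&
	       exit_handlers_sketch[exit_code] != NULL;
}

/* Fills the run structure for an exit code that has no handler. */
static void report_unexpected_exit_sketch(struct vcpu_sketch *vcpu,
					  uint64_t exit_code)
{
	vcpu->run.exit_reason = EXIT_INTERNAL_ERROR_SKETCH;
	vcpu->run.suberror = SUBERR_UNEXPECTED_EXIT_SKETCH;
	vcpu->run.ndata = 2;
	vcpu->run.data[0] = exit_code;
	vcpu->run.data[1] = vcpu->last_vmentry_cpu;
}

/* Caller: the check and the reporting are both visible here. */
static int invoke_exit_handler_sketch(struct vcpu_sketch *vcpu,
				      uint64_t exit_code)
{
	if (!exit_is_supported_sketch(exit_code)) {
		report_unexpected_exit_sketch(vcpu, exit_code);
		return 0;	/* 0 means "exit to userspace" */
	}

	return exit_handlers_sketch[exit_code](vcpu);
}
```

If more than one caller shows up, the check and the reporting could be
folded back together under a name that says so, along the lines of the
svm_handle_invalid_exit() suggestion.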
>
>> +
>> + return false;
>> +}
>> +
>> +static int svm_invoke_exit_handler(struct vcpu_svm *svm, u64 exit_code)
>> +{
>> + if (!svm_is_supported_exit(&svm->vcpu, exit_code))
>> + return 0;
>> +
>> +#ifdef CONFIG_RETPOLINE
>> + if (exit_code == SVM_EXIT_MSR)
>> + return msr_interception(svm);
>> + else if (exit_code == SVM_EXIT_VINTR)
>> + return interrupt_window_interception(svm);
>> + else if (exit_code == SVM_EXIT_INTR)
>> + return intr_interception(svm);
>> + else if (exit_code == SVM_EXIT_HLT)
>> + return halt_interception(svm);
>> + else if (exit_code == SVM_EXIT_NPF)
>> + return npf_interception(svm);
>> +#endif
>> + return svm_exit_handlers[exit_code](svm);
>
> Now I see why kvm_skip_emulated_instruction() is bailing on SEV-ES guests:
> #VMGEXIT simply routes through the legacy exit handlers. Which totally makes
> sense from a code reuse perspective, but the lack of sanity checking with that
> approach is undesirable, e.g. I assume there are a big pile of exit codes that
> are flat out unsupported for SEV-ES, and ideally KVM would yell loudly if it
> tries to do skip_emulated_instruction() for a protected guest.
>
> Rather than route through the legacy handlers, I suspect it will be more
> desirable in the long run to have a separate path for #VMGEXIT, i.e. a path
> that does the back half of emulation (the front half being the "fetch" phase).
Except there are some automatic exits (AE events) that don't go through
VMGEXIT, and we would need to be sure that RIP isn't updated for them. I can
audit the AE events and see what's possible.
Additionally, maybe just ensuring that kvm_x86_ops.get_rflags() doesn't
return something with the TF flag set eliminates the need for the change
to kvm_skip_emulated_instruction().
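
A rough sketch of the TF idea, in plain user-space C. The struct and the
guest_state_protected field are assumptions made for illustration, not
existing KVM names; only the position of TF (bit 8 of RFLAGS) is
architectural. The point is that a caller testing TF would never see it set
for a protected guest, so it would never try to queue a single-step trap.

```c
#include <stdbool.h>
#include <stdint.h>

/* TF (trap flag) is bit 8 of RFLAGS. */
#define RFLAGS_TF_SKETCH	(UINT64_C(1) << 8)

/* Hypothetical stand-in for the vCPU state; not the kernel's struct. */
struct vcpu_sketch {
	bool guest_state_protected;	/* assumed marker for an SEV-ES guest */
	uint64_t rflags;		/* whatever value the host currently holds */
};

/* Never report TF for a guest whose register state is encrypted. */
static uint64_t get_rflags_sketch(const struct vcpu_sketch *vcpu)
{
	uint64_t rflags = vcpu->rflags;

	if (vcpu->guest_state_protected)
		rflags &= ~RFLAGS_TF_SKETCH;

	return rflags;
}

/* Simplified model of the check a skip-instruction path would make. */
static bool needs_singlestep_sketch(const struct vcpu_sketch *vcpu)
{
	return (get_rflags_sketch(vcpu) & RFLAGS_TF_SKETCH) != 0;
}
```

With that in place, a stale TF bit in the host's copy of RFLAGS has no
effect for a protected guest, while an unprotected guest behaves as before.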
>
> The biggest downsides would be code duplication and ongoing maintenance. Our
> current approach for TDX is to eat that overhead, because it's not _that_ much
> code. But, maybe there's a middle ground, e.g. using the existing flows but
> having them skip (heh) kvm_skip_emulated_instruction() for protected guests.
>
> There are a few flows, e.g. MMIO emulation, that will need dedicated
> implementations, but I'm 99% certain we can put those in x86.c and share them
> between SEV-ES and TDX.
>
> One question that will impact KVM's options: can KVM inject exceptions to
> SEV-ES guests? E.g. if the guest requests emulation of a bogus WRMSR, is the
> #GP delivered as an actual #GP, or is the error "returned" via the GHCB?
Yes, for an SEV-ES guest, you can inject exceptions. But when using VMGEXIT
for, e.g., WRMSR, you would pass an exception error code back to the #VC
handler, which will propagate that exception in the guest with the registers
associated with the #VC.
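
To make the WRMSR case concrete, here is a small user-space sketch of
"returning" a #GP through the GHCB instead of injecting it. Everything in it
is an assumption made for illustration: the two-field struct is a stand-in
for the GHCB, the "1 in the first field means exception info follows"
convention and the event encoding (vector in bits 7:0, type in bits 10:8,
valid in bit 31) should be checked against the GHCB spec and the APM, and
the known-MSR check is a placeholder.

```c
#include <stdbool.h>
#include <stdint.h>

#define GP_VECTOR_SKETCH	13u		/* #GP is vector 13 */
#define EVT_TYPE_EXCEPT_SKETCH	(3u << 8)	/* assumed: type 3 = exception */
#define EVT_VALID_SKETCH	(1u << 31)	/* assumed: valid bit */
#define MSR_EFER_SKETCH		0xc0000080u	/* EFER, used only as an example */

/* Hypothetical stand-in for the two GHCB fields used on return. */
struct ghcb_sketch {
	uint64_t sw_exit_info_1;
	uint64_t sw_exit_info_2;
};

/* Placeholder policy: only one MSR is treated as writable. */
static bool msr_is_known_sketch(uint32_t index)
{
	return index == MSR_EFER_SKETCH;
}

/*
 * Complete a WRMSR that the guest requested via VMGEXIT. On failure,
 * describe the #GP in the GHCB so the guest's #VC handler can raise it
 * with the register state associated with the #VC.
 */
static void complete_wrmsr_sketch(struct ghcb_sketch *ghcb, uint32_t index)
{
	if (msr_is_known_sketch(index)) {
		ghcb->sw_exit_info_1 = 0;	/* success, nothing to report */
		ghcb->sw_exit_info_2 = 0;
		return;
	}

	ghcb->sw_exit_info_1 = 1;		/* exception info follows */
	ghcb->sw_exit_info_2 = GP_VECTOR_SKETCH |
			       EVT_TYPE_EXCEPT_SKETCH |
			       EVT_VALID_SKETCH;
}
```

The hypervisor never touches guest RIP or queues an event in this flow; the
guest's #VC handler reads the two fields and decides what to raise.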
Thanks,
Tom
>
> The most annoying hiccup is that TDX doesn't use the "standard" GPRs, e.g. MSR
> index isn't passed via ECX. I'll play around with a common x86.c
> implementation to see how painful it will be to use for TDX. Given that SEV-ES
> is more closely aligned with legacy behavior (in terms of register usage),
> getting SEV-ES working on a common base should be relatively easy, at least in
> theory :-).
>