Re: [PATCH v15 22/23] KVM: SEV: Fix return code interpretation for RMP nested page faults

From: Sean Christopherson
Date: Fri May 10 2024 - 09:58:59 EST


On Thu, May 09, 2024, Michael Roth wrote:
> The intended logic when handling #NPFs with the RMP bit set (31) is to
> first check to see if the #NPF requires a shared<->private transition
> and, if so, to go ahead and let the corresponding KVM_EXIT_MEMORY_FAULT
> get forwarded on to userspace before proceeding with any handling of
> other potential RMP fault conditions like needing to PSMASH the RMP
> entry/etc (which will be done later if the guest still re-faults after
> the KVM_EXIT_MEMORY_FAULT is processed by userspace).
>
> The determination of whether any userspace handling of
> KVM_EXIT_MEMORY_FAULT is needed is done by interpreting the return code
> of kvm_mmu_page_fault(). However, the current code misinterprets the
> return code, expecting 0 to indicate a userspace exit rather than less
> than 0 (-EFAULT). This leads to the following unexpected behavior:
>
> - for KVM_EXIT_MEMORY_FAULTs resulting for implicit shared->private
> conversions, warnings get printed from sev_handle_rmp_fault()
> because it does not expect to be called for GPAs where
> KVM_MEMORY_ATTRIBUTE_PRIVATE is not set. Standard linux guests don't
> generally do this, but it is allowed and should be handled
> similarly to private->shared conversions rather than triggering any
> sort of warnings
>
> - if gmem support for 2MB folios is enabled (via currently out-of-tree
> code), implicit shared<->private conversions will always result in
> a PSMASH being attempted, even if it's not actually needed to
> resolve the RMP fault. This doesn't cause any harm, but results in a
> needless PSMASH and zapping of the sPTE
>
> Resolve these issues by calling sev_handle_rmp_fault() only when
> kvm_mmu_page_fault()'s return code is greater than or equal to 0,
> indicating a KVM_MEMORY_EXIT_FAULT/-EFAULT isn't needed. While here,
> simplify the code slightly and fix up the associated comments for better
> clarity.
>
> Fixes: ccc9d836c5c3 ("KVM: SEV: Add support to handle RMP nested page faults")
>
> Signed-off-by: Michael Roth <michael.roth@xxxxxxx>
> ---
> arch/x86/kvm/svm/svm.c | 10 ++++------
> 1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 426ad49325d7..9431ce74c7d4 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -2070,14 +2070,12 @@ static int npf_interception(struct kvm_vcpu *vcpu)
> svm->vmcb->control.insn_len);
>
> /*
> - * rc == 0 indicates a userspace exit is needed to handle page
> - * transitions, so do that first before updating the RMP table.
> + * rc < 0 indicates a userspace exit may be needed to handle page
> + * attribute updates, so deal with that first before handling other
> + * potential RMP fault conditions.
> */
> - if (error_code & PFERR_GUEST_RMP_MASK) {
> - if (rc == 0)
> - return rc;
> + if (rc >= 0 && error_code & PFERR_GUEST_RMP_MASK)

This isn't correct either. A return of '0' also indiciates "exit to userspace",
it just doesn't happen with SNP because '0' is returned only when KVM attempts
emulation, and that too gets short-circuited by svm_check_emulate_instruction().

And I would honestly drop the comment, KVM's less-than-pleasant 1/0/-errno return
values overload is ubiquitous enough that it should be relatively self-explanatory.

Or if you prefer to keep a comment, drop the part that specifically calls out
attributes updates, because that incorrectly implies that's the _only_ reason
why KVM checks the return. But my vote is to drop the comment, because it
essentially becomes "don't proceed to step 2 if step 1 failed", which kind of
makes the reader go "well, yeah".