Re: [PATCH v15 22/23] KVM: SEV: Fix return code interpretation for RMP nested page faults

From: Paolo Bonzini
Date: Fri May 10 2024 - 12:02:10 EST

Next message: Andy Shevchenko: "[PATCH v1 1/1] gpiolib: Show more info for interrupt only lines in debugfs"
Previous message: John Johansen: "Re: [PATCH] apparmor: fix possible NULL pointer dereference"
In reply to: Michael Roth: "Re: [PATCH v15 22/23] KVM: SEV: Fix return code interpretation for RMP nested page faults"
Next in thread: Michael Roth: "Re: [PATCH v15 22/23] KVM: SEV: Fix return code interpretation for RMP nested page faults"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 5/10/24 15:58, Sean Christopherson wrote:

On Thu, May 09, 2024, Michael Roth wrote:

The intended logic when handling #NPFs with the RMP bit set (31) is to
first check to see if the #NPF requires a shared<->private transition
and, if so, to go ahead and let the corresponding KVM_EXIT_MEMORY_FAULT
get forwarded on to userspace before proceeding with any handling of
other potential RMP fault conditions like needing to PSMASH the RMP
entry/etc (which will be done later if the guest still re-faults after
the KVM_EXIT_MEMORY_FAULT is processed by userspace).

The determination of whether any userspace handling of
KVM_EXIT_MEMORY_FAULT is needed is done by interpreting the return code
of kvm_mmu_page_fault(). However, the current code misinterprets the
return code, expecting 0 to indicate a userspace exit rather than less
than 0 (-EFAULT). This leads to the following unexpected behavior:

- for KVM_EXIT_MEMORY_FAULTs resulting for implicit shared->private
conversions, warnings get printed from sev_handle_rmp_fault()
because it does not expect to be called for GPAs where
KVM_MEMORY_ATTRIBUTE_PRIVATE is not set. Standard linux guests don't
generally do this, but it is allowed and should be handled
similarly to private->shared conversions rather than triggering any
sort of warnings

- if gmem support for 2MB folios is enabled (via currently out-of-tree
code), implicit shared<->private conversions will always result in
a PSMASH being attempted, even if it's not actually needed to
resolve the RMP fault. This doesn't cause any harm, but results in a
needless PSMASH and zapping of the sPTE

Resolve these issues by calling sev_handle_rmp_fault() only when
kvm_mmu_page_fault()'s return code is greater than or equal to 0,
indicating a KVM_MEMORY_EXIT_FAULT/-EFAULT isn't needed. While here,
simplify the code slightly and fix up the associated comments for better
clarity.

Fixes: ccc9d836c5c3 ("KVM: SEV: Add support to handle RMP nested page faults")

Signed-off-by: Michael Roth <michael.roth@xxxxxxx>
---
arch/x86/kvm/svm/svm.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 426ad49325d7..9431ce74c7d4 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2070,14 +2070,12 @@ static int npf_interception(struct kvm_vcpu *vcpu)
svm->vmcb->control.insn_len);
/*
- * rc == 0 indicates a userspace exit is needed to handle page
- * transitions, so do that first before updating the RMP table.
+ * rc < 0 indicates a userspace exit may be needed to handle page
+ * attribute updates, so deal with that first before handling other
+ * potential RMP fault conditions.
*/
- if (error_code & PFERR_GUEST_RMP_MASK) {
- if (rc == 0)
- return rc;
+ if (rc >= 0 && error_code & PFERR_GUEST_RMP_MASK)

This isn't correct either. A return of '0' also indiciates "exit to userspace",
it just doesn't happen with SNP because '0' is returned only when KVM attempts
emulation, and that too gets short-circuited by svm_check_emulate_instruction().

And I would honestly drop the comment, KVM's less-than-pleasant 1/0/-errno return
values overload is ubiquitous enough that it should be relatively self-explanatory.

Or if you prefer to keep a comment, drop the part that specifically calls out
attributes updates, because that incorrectly implies that's the _only_ reason
why KVM checks the return. But my vote is to drop the comment, because it
essentially becomes "don't proceed to step 2 if step 1 failed", which kind of
makes the reader go "well, yeah".

So IIUC you're suggesting

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 426ad49325d7..c39eaeb21981 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2068,16 +2068,11 @@ static int npf_interception(struct kvm_vcpu *vcpu)
static_cpu_has(X86_FEATURE_DECODEASSISTS) ?
svm->vmcb->control.insn_bytes : NULL,
svm->vmcb->control.insn_len);
+ if (rc <= 0)
+ return rc;
- /*
- * rc == 0 indicates a userspace exit is needed to handle page
- * transitions, so do that first before updating the RMP table.
- */
- if (error_code & PFERR_GUEST_RMP_MASK) {
- if (rc == 0)
- return rc;
+ if (error_code & PFERR_GUEST_RMP_MASK)
sev_handle_rmp_fault(vcpu, fault_address, error_code);
- }
return rc;
}

?

So, we're... a bit tight for 6.10 to include SNP and that is an
understatement. My plan is to merge it for 6.11, but do so
immediately after the merge window ends. In other words, it
is a delay in terms of release but not in terms of time. I
don't want QEMU and kvm-unit-tests work to be delayed any
further, in particular.

Once we sort out the loose ends of patches 21-23, you could send
it as a pull request.

Paolo

Next message: Andy Shevchenko: "[PATCH v1 1/1] gpiolib: Show more info for interrupt only lines in debugfs"
Previous message: John Johansen: "Re: [PATCH] apparmor: fix possible NULL pointer dereference"
In reply to: Michael Roth: "Re: [PATCH v15 22/23] KVM: SEV: Fix return code interpretation for RMP nested page faults"
Next in thread: Michael Roth: "Re: [PATCH v15 22/23] KVM: SEV: Fix return code interpretation for RMP nested page faults"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]