Re: [PATCH v3 1/2] KVM: x86: Check hypercall's exit to userspace generically

From: Binbin Wu
Date: Mon Nov 04 2024 - 04:03:48 EST

In reply to: Huang, Kai: "Re: [PATCH v3 1/2] KVM: x86: Check hypercall's exit to userspace generically"
Next in thread: Binbin Wu: "Re: [PATCH v3 1/2] KVM: x86: Check hypercall's exit to userspace generically"

On 11/2/2024 5:13 AM, Huang, Kai wrote:

On Fri, 2024-11-01 at 09:39 -0700, Sean Christopherson wrote:

On Fri, Nov 01, 2024, Kai Huang wrote:

On Thu, 2024-10-31 at 07:54 -0700, Sean Christopherson wrote:

On Thu, Oct 31, 2024, Kai Huang wrote:
- ret = __kvm_emulate_hypercall(vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl);
- if (nr == KVM_HC_MAP_GPA_RANGE && !ret)
- /* MAP_GPA tosses the request to the user space. */
- return 0;
+ r = __kvm_emulate_hypercall(vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl, &ret);
+ if (r <= r)
+ return r;

... should be:

if (r <= 0)
return r;

?

Another option might be to move the "set hypercall return value" code inside
__kvm_emulate_hypercall(). IIUC the reason __kvm_emulate_hypercall() was split
out is TDX: non-TDX uses RAX to carry the hypercall return value, while TDX
uses R10.

We can additionally pass a "kvm_hypercall_set_ret_func" function pointer to
__kvm_emulate_hypercall(), and invoke it inside. Then we can change
__kvm_emulate_hypercall() to return:
< 0 error,
==0 return to userspace,
> 0 go back to guest.

Hmm, and the caller can still handle kvm_skip_emulated_instruction(), because the
return value is KVM's normal pattern.
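
For illustration only, the three-way return convention proposed above can be modelled in plain userspace C. This is a sketch, not kernel code: the `DEMO_*` hypercall numbers, `demo_handle_hypercall()` and `demo_classify()` are hypothetical names that do not exist in KVM, and the failing hypercall is invented purely to exercise the error path.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical hypercall numbers, for illustration only. */
enum { DEMO_HC_NOP = 0, DEMO_HC_MAP_GPA_RANGE = 1, DEMO_HC_FAIL = 2 };

/* What the caller does next, derived from the handler's return value. */
enum demo_action { DEMO_FAIL, DEMO_EXIT_TO_USERSPACE, DEMO_RESUME_GUEST };

/*
 * Model of a handler following the convention discussed above:
 *   < 0  error
 *   == 0 return to userspace
 *   > 0  go back to the guest
 */
static int demo_handle_hypercall(unsigned long nr)
{
	switch (nr) {
	case DEMO_HC_MAP_GPA_RANGE:
		return 0;		/* userspace must finish the request */
	case DEMO_HC_FAIL:
		return -EINVAL;		/* model-only internal failure */
	default:
		assert(nr == DEMO_HC_NOP);
		return 1;		/* fully handled, resume the guest */
	}
}

/* The caller needs no per-hypercall special case, only the sign of r. */
static enum demo_action demo_classify(int r)
{
	if (r < 0)
		return DEMO_FAIL;
	return r == 0 ? DEMO_EXIT_TO_USERSPACE : DEMO_RESUME_GUEST;
}
```

The point of the sketch is that the caller's check collapses to `if (r <= 0) return r;`, with no knowledge of which hypercall number was issued.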

I like it!

But there's no need to pass a function pointer. KVM can write (and read) arbitrary
GPRs; it's just avoided in most cases so that the sanity checks and available/dirty
updates are elided. For this code though, it's easy enough to keep kvm_rxx_read()
for getting values, and eating the overhead of a single GPR write is a perfectly
fine tradeoff for eliminating the return multiplexing.

Lightly tested. Assuming this works for TDX and passes testing, I'll post a
mini-series next week.

--
From: Sean Christopherson <seanjc@xxxxxxxxxx>
Date: Fri, 1 Nov 2024 09:04:00 -0700
Subject: [PATCH] KVM: x86: Refactor __kvm_emulate_hypercall() to accept reg
names, not values

Rework __kvm_emulate_hypercall() to take the names of input and output
(guest return value) registers, as opposed to taking the input values and
returning the output value. As part of the refactor, change the actual
return value from __kvm_emulate_hypercall() to be KVM's de facto standard
of '0' == exit to userspace, '1' == resume guest, and -errno == failure.

Using the return value for KVM's control flow eliminates the multiplexed
return value, where '0' for KVM_HC_MAP_GPA_RANGE (and only that hypercall)
means "exit to userspace".

Use the direct GPR accessors to read values to avoid the pointless marking
of the registers as available, but use kvm_register_write_raw() for the
guest return value so that the innermost helper doesn't need to multiplex
its return value. Using the generic kvm_register_write_raw() adds very
minimal overhead, so as a one-off in a relatively slow path it's well
worth the code simplification.
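
As a rough userspace model of "register names, not values" (a sketch under stated assumptions, not the actual patch): the helper below receives the indices of the input and output registers, reads the inputs itself, writes the guest-visible result into the named output register, and keeps its C return value for control flow only. `struct demo_vcpu`, `enum demo_reg`, `demo_reg_read()`, `demo_reg_write()` and `demo_emulate_hypercall_by_reg()` are all hypothetical, as is the toy hypercall that returns its argument plus one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register file; only a few registers are modelled. */
enum demo_reg { DEMO_RAX, DEMO_RBX, DEMO_R10, DEMO_R11, DEMO_NR_REGS };

struct demo_vcpu {
	uint64_t regs[DEMO_NR_REGS];
};

static uint64_t demo_reg_read(const struct demo_vcpu *v, enum demo_reg r)
{
	assert((unsigned int)r < DEMO_NR_REGS);
	return v->regs[r];
}

static void demo_reg_write(struct demo_vcpu *v, enum demo_reg r, uint64_t val)
{
	assert((unsigned int)r < DEMO_NR_REGS);
	v->regs[r] = val;
}

/*
 * Takes register names (indices) rather than values. The guest-visible
 * result goes to ret_reg; the C return value is control flow only
 * (1 == resume guest in this model).
 */
static int demo_emulate_hypercall_by_reg(struct demo_vcpu *v,
					 enum demo_reg nr_reg,
					 enum demo_reg a0_reg,
					 enum demo_reg ret_reg)
{
	uint64_t nr = demo_reg_read(v, nr_reg);
	uint64_t a0 = demo_reg_read(v, a0_reg);
	/* Toy hypercall 1 returns a0 + 1; anything else returns all-ones. */
	uint64_t ret = (nr == 1) ? a0 + 1 : UINT64_MAX;

	demo_reg_write(v, ret_reg, ret);
	return 1;
}
```

In this model a non-TDX style caller would pass `DEMO_RAX` as the return register and a TDX style caller would pass `DEMO_R10`, so neither needs a callback just to place the result.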

Ah right :-)

Suggested-by: Kai Huang <kai.huang@xxxxxxxxx>
Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
---

I think Binbin can help to test on TDX, and assuming it works,

I tried adding a selftest case that does memory conversion directly via the
KVM hypercall for TDX, and found that the TDX code didn't handle the
hypercall's return value properly.

I then added a parameter to pass the cui callback, as mentioned in
https://lore.kernel.org/lkml/f95cd8c6-af5c-4d8f-99a8-16d0ec56d9a4@xxxxxxxxxxxxxxx/
and made the following change in the TDX code to make it work.

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index cd27ebd3d7d1..efa434c6547d 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1072,6 +1072,15 @@ static int tdx_handle_triple_fault(struct kvm_vcpu *vcpu)
        return 0;
}

+static int complete_hypercall_exit(struct kvm_vcpu *vcpu)
+{
+       u64 ret = vcpu->run->hypercall.ret;
+
+       kvm_r10_write(vcpu, ret);
+       ++vcpu->stat.hypercalls;
+
+       return 1;
+}
+
static int tdx_emulate_vmcall(struct kvm_vcpu *vcpu)
{
        int r;
@@ -1087,7 +1096,7 @@ static int tdx_emulate_vmcall(struct kvm_vcpu *vcpu)
         * R10: KVM hypercall number
         * arguments: R11, R12, R13, R14.
         */
-       r = __kvm_emulate_hypercall(vcpu, r10, r11, r12, r13, r14, true, 0, R10);
+       r = __kvm_emulate_hypercall(vcpu, r10, r11, r12, r13, r14, true, 0, R10, complete_hypercall_exit);

        return r > 0;
}
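
The flow the diff above relies on can be sketched as a standalone userspace model: the hypercall exits to userspace, userspace stores a result, and on re-entry a completion callback copies that result into R10 before the guest resumes. Everything named `demo_*` here is hypothetical; `userspace_ret` merely stands in for `vcpu->run->hypercall.ret`, and the way the pending callback is stored is an assumption of the model rather than a description of KVM internals.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_vcpu;
typedef int (*demo_complete_fn)(struct demo_vcpu *);

struct demo_vcpu {
	uint64_t r10;			/* TDX-style hypercall return register */
	uint64_t userspace_ret;		/* stands in for run->hypercall.ret */
	unsigned long hypercalls;	/* statistics counter */
	demo_complete_fn complete;	/* pending completion, if any */
};

/* Mirrors the shape of complete_hypercall_exit() in the diff above. */
static int demo_complete_hypercall_exit(struct demo_vcpu *v)
{
	v->r10 = v->userspace_ret;
	++v->hypercalls;
	return 1;			/* resume the guest */
}

/* Model of a hypercall that must be finished by userspace. */
static int demo_emulate_map_gpa_range(struct demo_vcpu *v, demo_complete_fn cb)
{
	v->complete = cb;
	return 0;			/* exit to userspace */
}

/* On re-entry, run the pending completion exactly once. */
static int demo_reenter(struct demo_vcpu *v)
{
	demo_complete_fn cb = v->complete;

	if (!cb)
		return 1;
	v->complete = NULL;
	return cb(v);
}
```

A caller in this model would invoke `demo_emulate_map_gpa_range()`, let "userspace" fill in `userspace_ret`, and then call `demo_reenter()`, after which R10 holds the value userspace supplied.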

Reviewed-by: Kai Huang <kai.huang@xxxxxxxxx>
