On Fri, 2024-11-01 at 09:39 -0700, Sean Christopherson wrote:I tried to add a selftest case to do memory conversion via kvm hypercall
On Fri, Nov 01, 2024, Kai Huang wrote:Ah right :-)
On Thu, 2024-10-31 at 07:54 -0700, Sean Christopherson wrote:Hmm, and the caller can still handle kvm_skip_emulated_instruction(), because the
On Thu, Oct 31, 2024, Kai Huang wrote:... should be:
- ret = __kvm_emulate_hypercall(vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl);
- if (nr == KVM_HC_MAP_GPA_RANGE && !ret)
- /* MAP_GPA tosses the request to the user space. */
- return 0;
+ r = __kvm_emulate_hypercall(vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl, &ret);
+ if (r <= r)
+ return r;
if (r <= 0)
return r;
?
Another option might be we move "set hypercall return value" code inside
__kvm_emulate_hypercall(). So IIUC the reason to split
__kvm_emulate_hypercall() out is for TDX, and while non-TDX uses RAX to carry
the hypercall return value, TDX uses R10.
We can additionally pass a "kvm_hypercall_set_ret_func" function pointer to
__kvm_emulate_hypercall(), and invoke it inside. Then we can change
__kvm_emulate_hypercall() to return:
< 0 error,
==0 return to userspace,
> 0 go back to guest.
return value is KVM's normal pattern.
I like it!
But, there's no need to pass a function pointer, KVM can write (and read) arbitrary
GPRs, it's just avoided in most cases so that the sanity checks and available/dirty
updates are elided. For this code though, it's easy enough to keep kvm_rxx_read()
for getting values, and eating the overhead of a single GPR write is a perfectly
fine tradeoff for eliminating the return multiplexing.
Lightly tested. Assuming this works for TDX and passes testing, I'll post a
mini-series next week.
--
From: Sean Christopherson <seanjc@xxxxxxxxxx>
Date: Fri, 1 Nov 2024 09:04:00 -0700
Subject: [PATCH] KVM: x86: Refactor __kvm_emulate_hypercall() to accept reg
names, not values
Rework __kvm_emulate_hypercall() to take the names of input and output
(guest return value) registers, as opposed to taking the input values and
returning the output value. As part of the refactor, change the actual
return value from __kvm_emulate_hypercall() to be KVM's de facto standard
of '0' == exit to userspace, '1' == resume guest, and -errno == failure.
Using the return value for KVM's control flow eliminates the multiplexed
return value, where '0' for KVM_HC_MAP_GPA_RANGE (and only that hypercall)
means "exit to userspace".
Use the direct GPR accessors to read values to avoid the pointless marking
of the registers as available, but use kvm_register_write_raw() for the
guest return value so that the innermost helper doesn't need to multiplex
its return value. Using the generic kvm_register_write_raw() adds very
minimal overhead, so as a one-off in a relatively slow path it's well
worth the code simplification.
Suggested-by: Kai Huang <kai.huang@xxxxxxxxx>I think Binbin can help to test on TDX, and assuming it works,
Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
---
Reviewed-by: Kai Huang <kai.huang@xxxxxxxxx>