[PATCH] KVM: x86: Use get_cpl directly in case of vcpu_load to improve accuracy
From: Like Xu
Date: Thu Nov 23 2023 - 02:58:50 EST
From: Like Xu <likexu@xxxxxxxxxxx>
When vcpu is consistent with kvm_get_running_vcpu(), use get_cpl directly
to return the current exact state for the callers of vcpu_in_kernel API.
In scenarios where VM payload is profiled via perf-kvm, it's noticed that
the value of vcpu->arch.preempted_in_kernel is not strictly synchronised
with current vcpu_cpl.
This affects perf/core's ability to make use of the kvm_guest_state() API
to tag guest RIP with PERF_RECORD_MISC_GUEST_{KERNEL|USER} and record it
in the sample. This causes perf/tool to fail to connect the vcpu RIPs to
the guest kernel space symbols when parsing these samples due to incorrect
PERF_RECORD_MISC flags:
Before (perf-report of a cpu-cycles sample):
1.23% :58945 [unknown] [u] 0xffffffff818012e0
Given the semantics of preempted_in_kernel, it may not be easy (w/o extra
effort) to reconcile changes between preempted_in_kernel and CPL values.
Therefore to make this API more trustworthy, fallback to using get_cpl()
directly when the vcpu is loaded:
After:
1.35% :60703 [kernel.vmlinux] [g] asm_exc_page_fault
More performance squeezing is clearly possible, with priority given to
correcting its accuracy as a basic move.
Signed-off-by: Like Xu <likexu@xxxxxxxxxxx>
---
arch/x86/kvm/x86.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2c924075f6f1..c454df904a45 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -13031,7 +13031,10 @@ bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
if (vcpu->arch.guest_state_protected)
return true;
- return vcpu->arch.preempted_in_kernel;
+ if (vcpu != kvm_get_running_vcpu())
+ return vcpu->arch.preempted_in_kernel;
+
+ return static_call(kvm_x86_get_cpl)(vcpu) == 0;
}
unsigned long kvm_arch_vcpu_get_ip(struct kvm_vcpu *vcpu)
base-commit: 45b890f7689eb0aba454fc5831d2d79763781677
--
2.43.0