Re: [PATCH 1/2] KVM: x86: allow guest to send its _stext for kvm profiling

From: Sean Christopherson
Date: Mon May 09 2022 - 19:55:14 EST


On Tue, Apr 12, 2022, Wei Zhang wrote:
> The profiling buffer is indexed by (pc - _stext) in do_profile_hits(),
> which doesn't work for KVM profiling because the pc represents an address
> in the guest kernel. readprofile is broken in this case, unless the guest
> kernel happens to have the same _stext as the host kernel.
>
> This patch adds a new hypercall so guests could send its _stext to the
> host, which will then be used to adjust the calculation for KVM profiling.

Disclaimer: I know nothing about using profiling.

Why not just omit the _stext adjustment and profile the raw guest RIP? It seems
like userspace needs to know about the guest layout in order to make use of the
profiling info, so why not report raw info and let host userspace do all
adjustments?

> Signed-off-by: Wei Zhang <zhanwei@xxxxxxxxxx>
> ---
> arch/x86/kvm/x86.c | 15 +++++++++++++++
> include/linux/kvm_host.h | 4 ++++
> include/uapi/linux/kvm_para.h | 1 +
> virt/kvm/Kconfig | 5 +++++
> 4 files changed, 25 insertions(+)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 547ba00ef64f..abeacdd5d362 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9246,6 +9246,12 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
> vcpu->arch.complete_userspace_io = complete_hypercall_exit;
> return 0;
> }
> +#ifdef CONFIG_ACCURATE_KVM_PROFILING
> + case KVM_HC_GUEST_STEXT:
> + vcpu->kvm->guest_stext = a0;

Rather than snapshot the guest's _stext, snapshot the delta. E.g.

	vcpu->kvm->arch.guest_stext_offset = (unsigned long)_stext - a0;

Then the profiling flow can just be

	unsigned long rip;

	rip = kvm_rip_read(vcpu) + vcpu->kvm->arch.guest_stext_offset;
	profile_hit(KVM_PROFILING, (void *)rip);
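
To sanity check the arithmetic, here is a standalone userspace sketch (not
kernel code). The helper names and the addresses used with them are made up
for illustration; the only thing taken from the patch is that the profiling
buffer is indexed by (pc - _stext). The point is that adding the snapshotted
delta to a guest RIP yields the same offset the guest would compute against
its own _stext, including when the subtraction wraps.

```c
#include <stdint.h>

/*
 * Hypothetical helper: the delta the host would snapshot once, when the
 * guest reports its _stext. Unsigned wraparound is intentional, the guest's
 * _stext may be above or below the host's.
 */
static uint64_t stext_delta(uint64_t host_stext, uint64_t guest_stext)
{
	return host_stext - guest_stext;
}

/* Hypothetical helper: rebase a guest RIP into the host's text range. */
static uint64_t rebase_guest_rip(uint64_t guest_rip, uint64_t delta)
{
	return guest_rip + delta;
}

/* Offset used to index the profiling buffer, i.e. (pc - _stext). */
static uint64_t profile_offset(uint64_t pc, uint64_t host_stext)
{
	return pc - host_stext;
}
```

E.g. with a host _stext of 0xffffffff81000000, a guest _stext of
0xffffffff9a000000 and a guest RIP 0x1234 bytes into the guest's text, the
rebased RIP lands at host _stext + 0x1234 and profile_offset() returns 0x1234.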


> + ret = 0;
> + break;
> +#endif
> default:
> ret = -KVM_ENOSYS;
> break;
> @@ -10261,6 +10267,15 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> */
> if (unlikely(prof_on == KVM_PROFILING)) {
> unsigned long rip = kvm_rip_read(vcpu);
> +#ifdef CONFIG_ACCURATE_KVM_PROFILING

A Kconfig, and really any #define, is completely unnecessary. This is all x86
code, just throw the offset into struct kvm_arch.