Re: [PATCH v2 0/5] perf: KVM: Enable callchains for guests
From: Marc Zyngier
Date: Wed Oct 11 2023 - 12:45:49 EST
On Sun, 08 Oct 2023 15:48:17 +0100,
Tianyi Liu <i.pear@xxxxxxxxxxx> wrote:
>
> Hi there,
>
> This series of patches enables callchains for guests (used by perf kvm),
> which holds the top spot on the perf wiki TODO list [1]. This allows users
> to perform guest OS callchain or performance analysis from outside the
> guest using PMU events.
>
> The event processing flow is as follows (shown as backtrace):
> #0 kvm_arch_vcpu_get_frame_pointer / kvm_arch_vcpu_read_virt (per arch)
> #1 kvm_guest_get_frame_pointer / kvm_guest_read_virt
> <callback function pointers in `struct perf_guest_info_callbacks`>
> #2 perf_guest_get_frame_pointer / perf_guest_read_virt
> #3 perf_callchain_guest
> #4 get_perf_callchain
> #5 perf_callchain
>
> Between #0 and #1 is the interface between KVM and the arch-specific
> impl, while between #1 and #2 is the interface between Perf and KVM.
> The 1st patch implements #0. The 2nd patch extends interfaces between #1
> and #2, while the 3rd patch implements #1. The 4th patch implements #3
> and modifies #4 #5. The last patch is for userspace utils.
>
> Since arm64 hasn't provided some foundational infrastructure (interface
> for reading from a virtual address of guest), the arm64 implementation
> is stubbed for now because it's a bit complex, and will be implemented
> later.
I hope you realise that such an "interface" would be, by definition,
fragile and very likely to break in a subtle way. The only existing
case where we walk the guest's page tables is for NV, and even that is
extremely fragile.

Given that, I really wonder why this needs to happen in the kernel.
Userspace has all the required information to interrupt a vcpu and
walk its current context, without any additional kernel support. What
are the bits here that cannot be implemented anywhere else?
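
For illustration only, and not taken from the series or from KVM: the walk under discussion needs just two primitives, "get the vcpu's frame pointer" and "read guest virtual memory", which is why it can in principle live on either side of the kernel boundary. The sketch below assumes an x86-64 style frame record (saved frame pointer followed by return address). Every name in it (`guest_ops`, `walk_guest_callchain`, `toy_vcpu`, `toy_demo`) is hypothetical, and the "guest" is a flat buffer standing in for real address translation.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the two hooks the cover letter describes. */
struct guest_ops {
    uint64_t (*get_frame_pointer)(void *vcpu);
    /* Returns 0 on success, -1 if the guest virtual address is unreadable. */
    int (*read_virt)(void *vcpu, uint64_t gva, void *dst, size_t len);
};

/* Assumed x86-64 layout: saved frame pointer, then return address. */
struct frame_record {
    uint64_t next_fp;
    uint64_t ret_addr;
};

/* Follow the frame-pointer chain, collecting return addresses. */
size_t walk_guest_callchain(const struct guest_ops *ops, void *vcpu,
                            uint64_t *ips, size_t max_depth)
{
    uint64_t fp = ops->get_frame_pointer(vcpu);
    size_t n = 0;

    while (fp != 0 && n < max_depth) {
        struct frame_record rec;

        if (ops->read_virt(vcpu, fp, &rec, sizeof(rec)) != 0)
            break;              /* unmapped address: stop the walk */
        ips[n++] = rec.ret_addr;
        if (rec.next_fp <= fp)
            break;              /* stack grows down; callers must sit higher */
        fp = rec.next_fp;
    }
    return n;
}

/* Toy "guest": a flat 64-byte buffer mapped at guest address `base`. */
struct toy_vcpu {
    uint64_t base;
    uint64_t fp;
    uint8_t mem[64];
};

static uint64_t toy_get_fp(void *vcpu)
{
    return ((struct toy_vcpu *)vcpu)->fp;
}

static int toy_read_virt(void *vcpu, uint64_t gva, void *dst, size_t len)
{
    struct toy_vcpu *v = vcpu;

    /* Bounds check written to avoid overflow in gva + len. */
    if (gva < v->base || len > sizeof(v->mem) ||
        gva - v->base > sizeof(v->mem) - len)
        return -1;
    memcpy(dst, v->mem + (gva - v->base), len);
    return 0;
}

/* Builds two chained frames and returns the number of entries collected. */
size_t toy_demo(uint64_t ips[4])
{
    struct toy_vcpu v = { .base = 0x1000, .fp = 0x1000 };
    struct frame_record f0 = { .next_fp = 0x1010, .ret_addr = 0x401000 };
    struct frame_record f1 = { .next_fp = 0, .ret_addr = 0x402000 };
    struct guest_ops ops = { toy_get_fp, toy_read_virt };

    memcpy(v.mem + 0x00, &f0, sizeof(f0));
    memcpy(v.mem + 0x10, &f1, sizeof(f1));
    return walk_guest_callchain(&ops, &v, ips, 4);
}
```

The fragile part is entirely inside `read_virt`: a real implementation has to translate guest virtual addresses through page tables the guest may be modifying concurrently, which the toy buffer sidesteps.
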
>
> Tested with both 32-bit and 64-bit guest operating systems / unikernels;
> `perf script` correctly shows the expected callchains.
> FlameGraphs can also be generated with this series of patches and [2].
>
> Any feedback will be greatly appreciated.
>
> [1] https://perf.wiki.kernel.org/index.php/Todo
> [2] https://github.com/brendangregg/FlameGraph
>
> v1:
> https://lore.kernel.org/kvm/SYYP282MB108686A73C0F896D90D246569DE5A@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
>
> Changes since v1:
> - v1 only includes partial KVM modifications, while v2 is a complete
> implementation. Also updated based on Sean's feedback.
>
> Tianyi Liu (5):
> KVM: Add arch specific interfaces for sampling guest callchains
> perf kvm: Introduce guest interfaces for sampling callchains
> KVM: implement new perf interfaces
> perf kvm: Support sampling guest callchains
> perf tools: Support PERF_CONTEXT_GUEST_* flags
>
> arch/arm64/kvm/arm.c | 17 +++++++++
Given that there is more to KVM than just arm64 and x86, I suggest
that you move the lack of support for this feature into the main KVM
code.

Thanks,

M.
--
Without deviation from the norm, progress is not possible.