[PATCH RFC 0/8] KVM: x86/pmu: Enable Fixed Counter3 and Topdown Perf Metrics

From: Like Xu
Date: Mon Dec 12 2022 - 07:59:25 EST


Hi,

The Ice Lake core PMU provides built-in support for Top-down
Microarchitecture Analysis (TMA) method level 1 metrics. These metrics are
always available to cross-validate performance observations, freeing
general-purpose counters to count other events in high counter utilization
scenarios. For more details about the method, refer to the Top-Down Analysis
Method chapter (Appendix B.1) of the Intel® 64 and IA-32 Architectures
Optimization Reference Manual and to SDM 19.3.9.3, Performance Metrics.

This patchset enables Intel guest Topdown for KVM-based guests. The basic
enabling framework remains unchanged: a perf_metric MSR is introduced,
a group (rather than a single one) of perf_events is created in KVM and
bound to fixed counter3 to obtain hardware resources, and the guest value
of the perf_metric MSR is assembled from the counts of the grouped
perf_events.
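
As a rough illustration of that last step (not the code from patch 8), the
sketch below packs four level-1 event counts into a PERF_METRICS-style
value. It assumes the level-1 layout of one 8-bit field per metric
(retiring, bad speculation, frontend bound, backend bound, from bit 0
upward) where 0xff stands for all slots. The names td_count_to_field() and
td_assemble_metrics() are hypothetical and exist only for this example.

```c
#include <stdint.h>

/* Assumed level-1 field order, lowest byte first. */
enum td_l1 { TD_RETIRING, TD_BAD_SPEC, TD_FE_BOUND, TD_BE_BOUND, TD_L1_NR };

/*
 * Scale one event count to an 8-bit fraction of slots (0xff == all slots),
 * rounding to nearest. Assumes count * 0xff does not overflow 64 bits.
 */
uint8_t td_count_to_field(uint64_t count, uint64_t slots)
{
	uint64_t field;

	if (!slots)
		return 0;
	field = (count * 0xff + slots / 2) / slots;
	return field > 0xff ? 0xff : (uint8_t)field;
}

/* Pack the four level-1 fields into a PERF_METRICS-style 64-bit value. */
uint64_t td_assemble_metrics(const uint64_t counts[TD_L1_NR], uint64_t slots)
{
	uint64_t val = 0;
	int i;

	for (i = 0; i < TD_L1_NR; i++)
		val |= (uint64_t)td_count_to_field(counts[i], slots) << (i * 8);
	return val;
}
```

Fed with the counts from the perf stat sample in this letter
(slots = 34,505,682), this yields the fields 105, 12, 108 and 28, i.e.
0x1c6c0c69. Rounding to nearest matters here: plain truncation would give
11, 107 and 27 for the last three fields.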

On the KVM side, patches 4-6 may be reviewed independently if KVM only
enables fixed counter3 as a normal slots event for counting and sampling.
Patch 7 updates the infrastructure for creating grouped events in KVM,
and patch 8 uses grouped events to emulate the guest MSR_PERF_METRICS.

On the perf side, patches 1-3 are awaiting review for tip/perf/core and
could be accepted separately if they make sense. TBH, I don't think
perf/core is fully prepared to support kernel-space grouped counters,
considering the comments around perf_enable_disable(). But after much
exploration on my part, this is probably the most promising way to get
KVM to create slots plus metrics events. If the addition of *group_leader
messes things up, please shout and tell me what you need.

The individual commit messages give more detail and may answer
code-related questions.

Typical perf tool usage on a Linux guest looks like this:
$ perf stat --topdown --td-level=1 -I1000 --no-metric-only sleep 1
#        time        counts  unit  events
  1.000548528    34,505,682        slots
  1.000548528    14,208,222        topdown-retiring   #  41.5% Retiring
  1.000548528     1,623,796        topdown-bad-spec   #   4.7% Bad Speculation
  1.000548528    14,614,171        topdown-fe-bound   #  42.7% Frontend Bound
  1.000548528     3,788,859        topdown-be-bound   #  11.1% Backend Bound
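
Judging only from the numbers above (this is not checked against the perf
tool source), each percentage matches the event count divided by the sum
of the four topdown counts (34,235,048), not by the slots count. A small
sketch of that arithmetic, with the hypothetical helper
td_share_permille() returning tenths of a percent:

```c
#include <stdint.h>

#define TD_L1_NR 4

/*
 * Share of one topdown count in the sum of all four level-1 counts,
 * in tenths of a percent, rounded to nearest. Returns 0 for an empty sum.
 */
unsigned int td_share_permille(uint64_t count, const uint64_t counts[TD_L1_NR])
{
	uint64_t sum = 0;
	int i;

	for (i = 0; i < TD_L1_NR; i++)
		sum += counts[i];
	if (!sum)
		return 0;
	return (unsigned int)((count * 1000 + sum / 2) / sum);
}
```

For the sample above this gives 415, 47, 427 and 111, matching the printed
41.5%, 4.7%, 42.7% and 11.1%.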

Related kvm-unit-tests (KUT) changes will follow if there are no blocking
objections.

Note that this series depends on the following prerequisite patches:
https://lore.kernel.org/kvm/20221207071506.15733-2-likexu@xxxxxxxxxxx/
https://lore.kernel.org/kvm/20221205122048.16023-1-likexu@xxxxxxxxxxx/

Please feel free to comment and share your feedback.

Thanks,

Like Xu (8):
perf/core: Add *group_leader to perf_event_create_kernel_counter()
perf: x86/core: Expose the available number of the Topdown metrics
perf: x86/core: Sync PERF_METRICS bit together with fixed counter3
KVM: x86/pmu: Add Intel CPUID-hinted Topdown Slots event
KVM: x86/pmu: Add kernel-defined slots event to enable Fixed Counter3
KVM: x86/pmu: properly use INTEL_PMC_FIXED_RDPMC_BASE macro
KVM: x86/pmu: Use flex *event arrays to implement grouped events
KVM: x86/pmu: Add MSR_PERF_METRICS MSR emulation to enable Topdown

 arch/arm64/kvm/pmu-emul.c                 |   4 +-
 arch/x86/events/core.c                    |   1 +
 arch/x86/events/intel/core.c              |   3 +
 arch/x86/include/asm/kvm_host.h           |  14 +-
 arch/x86/include/asm/msr-index.h          |   1 +
 arch/x86/include/asm/perf_event.h         |   1 +
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c |   4 +-
 arch/x86/kvm/pmu.c                        | 149 ++++++++++++++++++++--
 arch/x86/kvm/pmu.h                        |  31 +++--
 arch/x86/kvm/svm/pmu.c                    |   1 +
 arch/x86/kvm/vmx/pmu_intel.c              |  53 +++++++-
 arch/x86/kvm/vmx/vmx.c                    |   3 +
 arch/x86/kvm/x86.c                        |   9 +-
 include/linux/perf_event.h                |   1 +
 kernel/events/core.c                      |   4 +-
 kernel/events/hw_breakpoint.c             |   4 +-
 kernel/events/hw_breakpoint_test.c        |   2 +-
 kernel/watchdog_hld.c                     |   2 +-
 18 files changed, 239 insertions(+), 48 deletions(-)

--
2.38.2