[PATCH v4 00/17] perf: KVM: Fix, optimize, and clean up callbacks

From: Sean Christopherson
Date: Wed Nov 10 2021 - 21:07:44 EST


This is a combination of ~2 series to fix bugs in the perf+KVM callbacks,
optimize the callbacks by employing static_call, and do a variety of
cleanups in both perf and KVM.

For the non-perf patches, I think everything except patch 13 (Paolo) and
patches 15 and 16 (Marc) has the appropriate acks.

Patch 1 fixes a set of mostly-theoretical bugs by protecting the guest
callbacks pointer with RCU.
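
As a rough userspace analogy for the pattern behind patch 1 (a sketch, not
the patch itself): C11 atomics stand in for rcu_assign_pointer() and
rcu_dereference(), and there is no real grace period. All struct and
function names below are hypothetical. The point is that the reader loads
the pointer exactly once and only ever uses that local copy.

```c
/* Userspace analogy only: C11 atomics stand in for the kernel's RCU
 * primitives, and nothing here waits for a grace period. */
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the guest callbacks struct. */
struct guest_cbs {
	unsigned int (*state)(void);
};

static _Atomic(struct guest_cbs *) guest_cbs_ptr;

/* Publish the callbacks (analogous to rcu_assign_pointer()). */
static void register_guest_cbs(struct guest_cbs *cbs)
{
	assert(cbs != NULL);
	atomic_store_explicit(&guest_cbs_ptr, cbs, memory_order_release);
}

/* Clear the pointer. Kernel code would follow this with
 * synchronize_rcu() before the owner of the callbacks may go away;
 * that step is not modeled here. */
static void unregister_guest_cbs(void)
{
	atomic_store_explicit(&guest_cbs_ptr, (struct guest_cbs *)NULL,
			      memory_order_release);
}

/* Reader: load the pointer once, then use only the local copy, so a
 * concurrent unregister cannot turn a checked pointer into NULL. */
static unsigned int query_guest_state(void)
{
	struct guest_cbs *cbs =
		atomic_load_explicit(&guest_cbs_ptr, memory_order_acquire);

	return cbs ? cbs->state() : 0;
}

/* Demo callbacks used to exercise the sketch. */
static unsigned int demo_state(void) { return 1; }
static struct guest_cbs demo_guest_cbs = { .state = demo_state };
```

Calling query_guest_state() before register_guest_cbs(&demo_guest_cbs)
takes the NULL branch; after registration it calls through the local copy.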

Patches 2 and 3 fix an Intel PT handling bug where KVM incorrectly
eats PT interrupts when PT is supposed to be owned entirely by the host.

Patches 4-9 clean up perf's callback infrastructure and switch to
static_call for arm64 and x86 (the only survivors).
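
The wrapper part of that cleanup can be pictured with the userspace sketch
below. It is illustrative only: DEMO_GUEST_PERF_EVENTS is a hypothetical
stand-in for a Kconfig option, every name is made up, and the static_call
machinery itself (which patches call sites in the kernel) is not modeled.

```c
/* Illustrative userspace sketch; all names here are hypothetical. */
#include <stddef.h>

/* Stand-in for a Kconfig option; remove to build the stub variant. */
#define DEMO_GUEST_PERF_EVENTS 1

struct demo_guest_cbs {
	unsigned int (*state)(void);
};

/* Demo callback and callbacks struct used to exercise the wrapper. */
static unsigned int demo_in_guest(void) { return 1; }
static struct demo_guest_cbs demo_kvm_cbs = { .state = demo_in_guest };

#ifdef DEMO_GUEST_PERF_EVENTS
static struct demo_guest_cbs *demo_cbs;

/* Wrapper: hides the "are callbacks registered?" check from callers. */
static inline unsigned int demo_guest_state(void)
{
	return demo_cbs ? demo_cbs->state() : 0;
}
#else
/* Stub: call sites compile unchanged when the feature is disabled. */
static inline unsigned int demo_guest_state(void)
{
	return 0;
}
#endif
```

Call sites use demo_guest_state() in both configurations, so no caller has
to carry its own #ifdef or NULL check.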

Patches 10-17 clean up related KVM code and unify the arm64/x86 callbacks.

Based on Linus' tree, commit cb690f5238d7 ("Merge tag 'for-5.16/drivers...).

v4:
- Rebase.
- Collect acks and reviews.
- Fully protect perf_guest_cbs with RCU. [Paolo]
- Add patch to hide arm64's kvm_arm_pmu_available behind
  CONFIG_HW_PERF_EVENTS=y.

v3:
- https://lore.kernel.org/all/20210922000533.713300-1-seanjc@xxxxxxxxxx/
- Add wrappers for guest callbacks so that stubs can be provided when
  GUEST_PERF_EVENTS=n.
- s/HAVE_GUEST_PERF_EVENTS/GUEST_PERF_EVENTS and select it from KVM
and XEN_PV instead of from top-level arm64/x86. [Paolo]
- Drop an unnecessary synchronize_rcu() when registering callbacks. [Peter]
- Retain a WARN_ON_ONCE() when unregistering callbacks if the caller
didn't provide the correct pointer. [Peter]
- Rework the static_call patch to move it all to common perf.
- Add a patch to drop the (un)register stubs, made possible after
having KVM+XEN_PV select GUEST_PERF_EVENTS.
- Split dropping guest callback "support" for arm, csky, etc... to a
separate patch, to make introducing GUEST_PERF_EVENTS cleaner.
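
The "warn if the caller didn't provide the correct pointer" item in the v3
notes can be sketched in userspace as follows. This is a minimal sketch
under stated assumptions: demo_warn_on_once() is a hypothetical stand-in
for WARN_ON_ONCE(), the early return on a mismatch is an assumption about
the intended behavior, and none of the names are the kernel's.

```c
/* Userspace sketch; every name here is hypothetical. */
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct demo_cbs {
	int unused;
};

static struct demo_cbs *registered_cbs;
static int warn_count;	/* how many warnings were actually printed */

/* Stand-in for WARN_ON_ONCE(): report a true condition the first time
 * only, and always return the condition so callers can branch on it. */
static bool demo_warn_on_once(bool cond, const char *what)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

static void demo_register(struct demo_cbs *cbs)
{
	registered_cbs = cbs;
}

/* Clear the registration only if the caller passes the pointer that
 * was registered. Returns true if the registration was cleared. */
static bool demo_unregister(struct demo_cbs *cbs)
{
	if (demo_warn_on_once(registered_cbs != cbs,
			      "unregister with wrong callbacks pointer"))
		return false;

	registered_cbs = NULL;
	return true;
}
```

A mismatched unregister leaves the existing registration in place and
warns at most once, no matter how often it is repeated.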

v2 (relative to static_call v10):
- Split the patch into the semantic change (multiplexed ->state) and
  introduction of static_call.
- Don't use '0' for "not a guest RIP".
- Handle unregister path.
- Drop changes for architectures that can be culled entirely.

v2 (relative to v1):
- https://lkml.kernel.org/r/20210828003558.713983-6-seanjc@xxxxxxxxxx
- Drop per-cpu approach. [Peter]
- Fix mostly-theoretical reload and use-after-free with READ_ONCE(),
  WRITE_ONCE(), and synchronize_rcu(). [Peter]
- Avoid new exports like the plague. [Peter]

v1:
- https://lkml.kernel.org/r/20210827005718.585190-1-seanjc@xxxxxxxxxx

v10 static_call:
- https://lkml.kernel.org/r/20210806133802.3528-2-lingshan.zhu@xxxxxxxxx

Like Xu (1):
  perf/core: Rework guest callbacks to prepare for static_call support

Sean Christopherson (16):
  perf: Protect perf_guest_cbs with RCU
  KVM: x86: Register perf callbacks after calling vendor's
    hardware_setup()
  KVM: x86: Register Processor Trace interrupt hook iff PT enabled in
    guest
  perf: Stop pretending that perf can handle multiple guest callbacks
  perf: Drop dead and useless guest "support" from arm, csky, nds32 and
    riscv
  perf: Add wrappers for invoking guest callbacks
  perf: Force architectures to opt-in to guest callbacks
  perf/core: Use static_call to optimize perf_guest_info_callbacks
  KVM: x86: Drop current_vcpu for kvm_running_vcpu + kvm_arch_vcpu
    variable
  KVM: x86: More precisely identify NMI from guest when handling PMI
  KVM: Move x86's perf guest info callbacks to generic KVM
  KVM: x86: Move Intel Processor Trace interrupt handler to vmx.c
  KVM: arm64: Convert to the generic perf callbacks
  KVM: arm64: Hide kvm_arm_pmu_available behind CONFIG_HW_PERF_EVENTS=y
  KVM: arm64: Drop perf.c and fold its tiny bits of code into arm.c
  perf: Drop guest callback (un)register stubs

 arch/arm/kernel/perf_callchain.c   | 28 ++------------
 arch/arm64/include/asm/kvm_host.h  | 11 +++++-
 arch/arm64/kernel/image-vars.h     |  2 +
 arch/arm64/kernel/perf_callchain.c | 13 ++++---
 arch/arm64/kvm/Kconfig             |  1 +
 arch/arm64/kvm/Makefile            |  2 +-
 arch/arm64/kvm/arm.c               | 10 ++++-
 arch/arm64/kvm/perf.c              | 59 ------------------------------
 arch/arm64/kvm/pmu-emul.c          |  2 +
 arch/csky/kernel/perf_callchain.c  | 10 -----
 arch/nds32/kernel/perf_event_cpu.c | 29 ++-------------
 arch/riscv/kernel/perf_callchain.c | 10 -----
 arch/x86/events/core.c             | 13 ++++---
 arch/x86/events/intel/core.c       |  5 +--
 arch/x86/include/asm/kvm_host.h    |  7 +++-
 arch/x86/kvm/Kconfig               |  1 +
 arch/x86/kvm/pmu.c                 |  2 +-
 arch/x86/kvm/svm/svm.c             |  2 +-
 arch/x86/kvm/vmx/vmx.c             | 25 ++++++++++++-
 arch/x86/kvm/x86.c                 | 58 +++++------------------------
 arch/x86/kvm/x86.h                 | 17 +++++++--
 arch/x86/xen/Kconfig               |  1 +
 arch/x86/xen/pmu.c                 | 32 +++++++---------
 include/kvm/arm_pmu.h              | 19 ++++++----
 include/linux/kvm_host.h           | 10 +++++
 include/linux/perf_event.h         | 44 ++++++++++++++++------
 init/Kconfig                       |  4 ++
 kernel/events/core.c               | 41 +++++++++++++++------
 virt/kvm/kvm_main.c                | 44 ++++++++++++++++++++++
 29 files changed, 246 insertions(+), 256 deletions(-)
 delete mode 100644 arch/arm64/kvm/perf.c

--
2.34.0.rc0.344.g81b53c2807-goog