[GIT PULL] Performance events changes for v6.12
From: Ingo Molnar
Date: Wed Sep 18 2024 - 08:45:51 EST
Linus,
Please pull the latest perf/core Git tree from:
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git perf-core-2024-09-18
# HEAD: 5e645f31139183ac9a282238da18ca6bbc1c6f4a Merge branch 'perf/urgent' into perf/core, to pick up fixes
[ Merge note: this pull request depends on you having pulled perf-urgent-2024-09-18 already. ]
Performance events changes for v6.12:
- Implement per-PMU context rescheduling to significantly improve single-PMU
performance, and related cleanups/fixes. (by Peter Zijlstra and Namhyung Kim)
- Fix ancient bug resulting in a lot of events being dropped erroneously
at higher sampling frequencies. (by Luo Gengkun)
- uprobes enhancements:
- Implement RCU-protected hot path optimizations for better performance:
"For baseline vs SRCU, peak throughput increased from 3.7 M/s (million uprobe
triggerings per second) up to about 8 M/s. For uretprobes it's a bit more
modest with a bump from 2.4 M/s to 5 M/s.
For SRCU vs RCU Tasks Trace, peak throughput for uprobes increases further from
8 M/s to 10.3 M/s (+28%!), and for uretprobes from 5.3 M/s to 5.8 M/s (+11%),
as we have more work to do on uretprobes side.
Even single-thread (no contention) performance is slightly better: 3.276 M/s to
3.396 M/s (+3.5%) for uprobes, and 2.055 M/s to 2.174 M/s (+5.8%)
for uretprobes."
(by Andrii Nakryiko et al)
- Document mmap_lock, don't abuse get_user_pages_remote(). (by Oleg Nesterov)
- Cleanups & fixes to prepare for future work:
- Remove uprobe_register_refctr()
- Simplify error handling for alloc_uprobe()
- Make uprobe_register() return struct uprobe *
- Fold __uprobe_unregister() into uprobe_unregister()
- Shift put_uprobe() from delete_uprobe() to uprobe_unregister()
- BPF: Fix use-after-free in bpf_uprobe_multi_link_attach()
(by Oleg Nesterov)
- New feature & ABI extension: allow events to use PERF_SAMPLE_READ with
inheritance, enabling sample-based profiling of a group of counters over
a hierarchy of processes or threads. (by Ben Gainey)
- Intel uncore & power events updates:
- Add Arrow Lake and Lunar Lake support
- Add PERF_EV_CAP_READ_SCOPE
- Clean up and enhance cpumask and hotplug support
(by Kan Liang)
- Add LNL uncore iMC freerunning support
- Use D0:F0 as a default device
(by Zhenyu Wang)
- Intel PT: fix AUX snapshot handling race. (by Adrian Hunter)
- Misc fixes and cleanups. (by James Clark, Jiri Olsa, Oleg Nesterov and Peter Zijlstra)
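
For the PERF_SAMPLE_READ + inheritance item above, a minimal userspace
sketch of how an attribute block might be filled in. The helper names
init_inherited_read_attr() and perf_event_open_raw() are hypothetical, the
choice of a cycles counter and a 100000-event period is arbitrary, and the
pairing with PERF_SAMPLE_TID and PERF_FORMAT_GROUP is an assumption about
how the new ABI is meant to be used, not something stated in this mail:

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: describe a sampling group leader whose samples
 * carry the group's counter values and which is inherited by children.
 */
void init_inherited_read_attr(struct perf_event_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_HARDWARE;
	attr->config = PERF_COUNT_HW_CPU_CYCLES;
	attr->sample_period = 100000;		/* arbitrary period */

	/* Assumption: TID is requested so samples identify the thread. */
	attr->sample_type = PERF_SAMPLE_READ | PERF_SAMPLE_TID;
	attr->read_format = PERF_FORMAT_GROUP | PERF_FORMAT_ID;

	attr->inherit = 1;			/* follow child tasks */
	attr->disabled = 1;			/* enable explicitly later */
}

/* glibc has no wrapper for perf_event_open(2), so go through syscall(2). */
long perf_event_open_raw(struct perf_event_attr *attr, pid_t pid, int cpu,
			 int group_fd, unsigned long flags)
{
	return syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags);
}
```

On kernels without this change the open is expected to be rejected for
this combination of flags, so callers should be prepared to fall back.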
Thanks,
Ingo
------------------>
Adrian Hunter (1):
perf/x86/intel/pt: Fix sampling synchronization
Andrii Nakryiko (7):
perf,x86: avoid missing caller address in stack traces captured in uprobe
uprobes: simplify error handling for alloc_uprobe()
uprobes: revamp uprobe refcounting and lifetime management
uprobes: protected uprobe lifetime with SRCU
uprobes: get rid of enum uprobe_filter_ctx in uprobe filter callbacks
uprobes: travers uprobe's consumer list locklessly under SRCU protection
uprobes: perform lockless SRCU-protected uprobes_tree lookup
Ben Gainey (2):
perf: Rename perf_event_context.nr_pending to nr_no_switch_fast.
perf: Support PERF_SAMPLE_READ with inherit
Ingo Molnar (2):
Merge branch 'perf/urgent' into perf/core, to pick up fixes
Merge branch 'perf/urgent' into perf/core, to pick up fixes
James Clark (1):
perf/x86/intel/bts: Fix comment about default perf_event_paranoid setting
Jiri Olsa (1):
selftests/bpf: fix uprobe.path leak in bpf_testmod
Kan Liang (8):
perf/x86/intel/uncore: Add Arrow Lake support
perf/x86/intel/uncore: Factor out common MMIO init and ops functions
perf/x86/intel/uncore: Add Lunar Lake support
perf: Generic hotplug support for a PMU with a scope
perf: Add PERF_EV_CAP_READ_SCOPE
perf/x86/intel/cstate: Clean up cpumask and hotplug
iommu/vt-d: Clean up cpumask and hotplug for perfmon
dmaengine: idxd: Clean up cpumask and hotplug for perfmon
Luo Gengkun (1):
perf/core: Fix small negative period being ignored
Namhyung Kim (1):
perf: Really fix event_function_call() locking
Oleg Nesterov (8):
uprobes: document the usage of mm->mmap_lock
uprobes: is_trap_at_addr: don't use get_user_pages_remote()
uprobes: kill uprobe_register_refctr()
uprobes: make uprobe_register() return struct uprobe *
uprobes: change uprobe_register() to use uprobe_unregister() instead of __uprobe_unregister()
uprobes: fold __uprobe_unregister() into uprobe_unregister()
uprobes: shift put_uprobe() from delete_uprobe() to uprobe_unregister()
bpf: Fix use-after-free in bpf_uprobe_multi_link_attach()
Peter Zijlstra (8):
perf/x86: Add hw_perf_event::aux_config
perf: Optimize context reschedule for single PMU cases
perf: Extract a few helpers
perf: Fix event_function_call() locking
perf: Add context time freeze
perf: Optimize __pmu_ctx_sched_out()
perf/uprobe: split uprobe_unregister()
rbtree: provide rb_find_rcu() / rb_find_add_rcu()
Zhenyu Wang (2):
perf/x86/intel/uncore: Add LNL uncore iMC freerunning support
perf/x86/intel/uncore: Use D0:F0 as a default device
arch/x86/events/core.c | 63 +++++++++++++++++++
arch/x86/events/intel/bts.c | 3 -
arch/x86/events/intel/cstate.c | 142 ++-----------------------------------------
arch/x86/events/intel/pt.c | 29 +++++----
arch/x86/events/intel/uncore.c | 9 +++
arch/x86/events/intel/uncore.h | 2 +
arch/x86/events/intel/uncore_snb.c | 185 ++++++++++++++++++++++++++++++++++++++++++++++++++------
drivers/dma/idxd/idxd.h | 7 ---
drivers/dma/idxd/init.c | 3 -
drivers/dma/idxd/perfmon.c | 98 +-----------------------------
drivers/iommu/intel/iommu.h | 2 -
drivers/iommu/intel/perfmon.c | 111 +---------------------------------
include/linux/cpuhotplug.h | 2 -
include/linux/perf_event.h | 32 +++++++++-
include/linux/rbtree.h | 67 +++++++++++++++++++++
include/linux/uprobes.h | 48 ++++++++-------
kernel/events/core.c | 586 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------------------------------------------------
kernel/events/uprobes.c | 505 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------------------------------------------------------------------
kernel/trace/bpf_trace.c | 38 ++++++------
kernel/trace/trace_uprobe.c | 44 +++++++-------
tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c | 27 +++++----
21 files changed, 1146 insertions(+), 857 deletions(-)
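
For reference, a minimal sketch of the rounding issue behind "perf/core:
Fix small negative period being ignored" in the shortlog above. It assumes
the bug is a low-pass-filter step that adds a fixed +7 bias before dividing
by 8, and that the fix applies the bias away from zero instead. The function
names are hypothetical and this is not the kernel code itself:

```c
#include <stdint.h>

/*
 * Assumed old behaviour: the bias is always added, so with C's
 * truncating division every delta in [-14, -1] collapses to 0 and
 * small downward period adjustments are lost.
 */
int64_t filter_step_biased(int64_t delta)
{
	return (delta + 7) / 8;
}

/*
 * Assumed fixed behaviour: bias away from zero, so a small negative
 * delta still produces a non-zero (negative) adjustment.
 */
int64_t filter_step_symmetric(int64_t delta)
{
	if (delta >= 0)
		delta += 7;
	else
		delta -= 7;
	return delta / 8;
}
```

With these definitions, filter_step_biased(-1) evaluates to 0 while
filter_step_symmetric(-1) evaluates to -1; for positive deltas the two
functions agree.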