[PATCH v1 0/7] PMU performance improvements
From: Ian Rogers
Date: Fri Oct 06 2023 - 22:13:50 EST
Performance improvements to PMU scanning by holding onto the
event/metric tables for a cpuid (avoiding regular expression
comparisons) and by lazily computing the default perf_event_attr for a
PMU.
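The two ideas can be illustrated with a minimal sketch. It is written in
Python for brevity and is not the code in this series: TABLES, stats,
find_table and the Pmu class are hypothetical names, and the cpuid
patterns and table contents are made up. It only shows the shape of the
optimization: remember the table found for the last cpuid so repeated
lookups skip the regular expression comparisons, and build a default
config only on first use.

```python
import re

# Hypothetical stand-in for a generated cpuid -> table mapping. The
# patterns and table contents are made up for illustration only.
TABLES = [
    (re.compile(r"VendorA-6-8[CD]"), {"events": ["example_event_a"]}),
    (re.compile(r"VendorA-6-9[7A]"), {"events": ["example_event_b"]}),
]

# Counts how many regular expression comparisons were performed.
stats = {"regex_matches": 0}

# Most recent lookup, remembered so a repeated cpuid needs no regex work.
_last = {"cpuid": None, "table": None}


def find_table(cpuid):
    """Return the table whose pattern matches cpuid, caching the last hit."""
    if _last["cpuid"] is not None and cpuid == _last["cpuid"]:
        return _last["table"]
    found = None
    for pattern, table in TABLES:
        stats["regex_matches"] += 1
        if pattern.match(cpuid):
            found = table
            break
    _last["cpuid"] = cpuid
    _last["table"] = found
    return found


class Pmu:
    """Hypothetical PMU whose default config is built only when first read."""

    def __init__(self, name):
        self.name = name
        self._default_config = None
        self.config_builds = 0  # how many times the config was computed

    @property
    def default_config(self):
        if self._default_config is None:
            self.config_builds += 1
            self._default_config = {"pmu": self.name, "config": 0}
        return self._default_config
```

In this sketch, calling find_table() twice with the same cpuid performs
the pattern matching only once, and a Pmu that is scanned but never used
to open an event never pays for building its default config.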
Before:
% Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 100 times
Average core PMU scanning took: 251.990 usec (+- 4.009 usec)
Average PMU scanning took: 3222.460 usec (+- 211.234 usec)
% Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 100 times
Average core PMU scanning took: 260.120 usec (+- 7.905 usec)
Average PMU scanning took: 3228.995 usec (+- 211.196 usec)
% Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 100 times
Average core PMU scanning took: 252.310 usec (+- 3.980 usec)
Average PMU scanning took: 3220.675 usec (+- 210.844 usec)
After:
% Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 100 times
Average core PMU scanning took: 28.530 usec (+- 0.602 usec)
Average PMU scanning took: 275.725 usec (+- 18.253 usec)
% Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 100 times
Average core PMU scanning took: 28.720 usec (+- 0.446 usec)
Average PMU scanning took: 271.015 usec (+- 18.762 usec)
% Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 100 times
Average core PMU scanning took: 31.040 usec (+- 0.612 usec)
Average PMU scanning took: 267.340 usec (+- 17.209 usec)
Measured with the pmu-scan benchmark on a Tiger Lake laptop, core PMU
scanning is reduced to 11.5% of the previous execution time and all PMU
scanning is reduced to 8.4% of the previous execution time. There is
also a 4.3% reduction in openat system calls.
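As a cross-check, the percentages can be recomputed from the six
benchmark runs quoted above. The short script below is only an
arithmetic sketch; the variable and function names are illustrative and
the values are copied from the output above.

```python
from statistics import mean

# Average scan times in usec, copied from the three "Before" and three
# "After" benchmark runs quoted in this message.
core_before = [251.990, 260.120, 252.310]
core_after = [28.530, 28.720, 31.040]
all_before = [3222.460, 3228.995, 3220.675]
all_after = [275.725, 271.015, 267.340]


def remaining_percent(before, after):
    """Mean 'after' time expressed as a percentage of the mean 'before' time."""
    return 100.0 * mean(after) / mean(before)


core_pct = remaining_percent(core_before, core_after)
all_pct = remaining_percent(all_before, all_after)

print(f"core PMU scanning: {core_pct:.1f}% of the previous time")
print(f"all PMU scanning:  {all_pct:.1f}% of the previous time")
```

Put the other way around, this corresponds to roughly an 8.7x speedup
for core PMU scanning and roughly an 11.9x speedup for all PMU scanning.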
Ian Rogers (7):
  perf pmu: Rename perf_pmu__get_default_config to perf_pmu__arch_init
  perf intel-pt: Move PMU initialization from default config code
  perf arm-spe: Move PMU initialization from default config code
  perf pmu: Const-ify file APIs
  perf pmu: Const-ify perf_pmu__config_terms
  perf pmu-events: Remember the events and metrics table
  perf pmu: Lazily compute default config

 tools/perf/arch/arm/util/cs-etm.c    | 13 ++------
 tools/perf/arch/arm/util/pmu.c       | 10 +++---
 tools/perf/arch/arm64/util/arm-spe.c | 48 +++++++++++++---------------
 tools/perf/arch/s390/util/pmu.c      |  3 +-
 tools/perf/arch/x86/util/intel-pt.c  | 27 +++++++---------
 tools/perf/arch/x86/util/pmu.c       |  6 ++--
 tools/perf/pmu-events/jevents.py     | 48 ++++++++++++++++------------
 tools/perf/util/arm-spe.h            |  4 ++-
 tools/perf/util/cs-etm.h             |  2 +-
 tools/perf/util/intel-pt.h           |  3 +-
 tools/perf/util/parse-events.c       | 12 +++----
 tools/perf/util/pmu.c                | 39 +++++++++++-----------
 tools/perf/util/pmu.h                | 18 ++++++-----
 tools/perf/util/python.c             |  2 +-
 14 files changed, 117 insertions(+), 118 deletions(-)

--
2.42.0.609.gbb76f46606-goog