[PATCH v1 00/16] Address some perf memory/data size issues
From: Ian Rogers
Date: Thu May 25 2023 - 03:12:16 EST
Try to reduce the data size of the perf command. Before these patches
a stripped non-debug binary was:
$ size -A perf
perf  :
section                 size       addr
.interp                   28        848
.note.gnu.property        32        880
.note.gnu.build-id        36        912
.note.ABI-tag             32        948
.gnu.hash              24628        984
.dynsym                88920      25616
.dynstr                70193     114536
.gnu.version            7410     184730
.gnu.version_r           800     192144
.rela.dyn             460824     192944
.rela.plt              14784     653768
.init                     23     671744
.plt                    9872     671776
.plt.got                  24     681648
.text                2279182     681680
.noinstr.text            476    2960864
.fini                      9    2961340
.rodata              7042922    2961408
.eh_frame_hdr          42844   10004332
.eh_frame             226496   10047176
.tbss                     48   10279720
.init_array               16   10279720
.fini_array                8   10279736
.data.rel.ro           53376   10279744
.dynamic                 736   10333120
.got                     328   10333856
.got.plt                4952   10334184
.data                 391088   10339136
.bss                  285776   10730240
.comment                  31          0
Total               11005894
And after:
perf  :
section                 size       addr
.interp                   28        848
.note.gnu.property        32        880
.note.gnu.build-id        36        912
.note.ABI-tag             32        948
.gnu.hash              24628        984
.dynsym                88944      25616
.dynstr                70217     114560
.gnu.version            7412     184778
.gnu.version_r           816     192192
.rela.dyn             460824     193008
.rela.plt              14808     653832
.init                     23     671744
.plt                    9888     671776
.plt.got                  24     681664
.text                2280446     681696
.noinstr.text            476    2962144
.fini                      9    2962620
.rodata              7048746    2965504
.eh_frame_hdr          42852   10014252
.eh_frame             226568   10057104
.tbss                     48   10285640
.init_array               16   10285640
.fini_array                8   10285656
.data.rel.ro          301408   10285664
.dynamic                 736   10587072
.got                     328   10587808
.got.plt                4960   10588136
.data                 100464   10593152
.bss                   22512   10693632
.comment                  31          0
Total               10707320
The total reported by size -A is reduced by 298,574 bytes. Of that,
263,264 bytes come from .bss, which doesn't count toward file size, so
the remaining sections shrink by 35,310 bytes while the runtime
footprint could shrink by up to 298,574 bytes. This is still just a
fraction of the .rodata section's size of 7,048,746 bytes, which mainly
contains the converted JSON events. The .rodata section needn't all be
mapped at the same time.
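As a rough illustration of why the numbers move between .bss, .data,
.data.rel.ro and .rodata, the sketch below shows where a typical
toolchain tends to place different kinds of statics. All names
(example_scratch, example_counters, example_table, example_names,
example_touch_all) are hypothetical and not taken from the perf
sources, and the exact placement depends on compiler and linker
options.

```c
#include <stddef.h>
#include <string.h>

/* Zero-initialized and writable: usually lands in .bss, which takes no
 * space in the file but occupies memory once its pages are touched. */
static char example_scratch[4096];

/* Initialized and writable: usually lands in .data. */
static int example_counters[4] = { 1, 2, 3, 4 };

/* const with no pointers: usually lands in .rodata. */
static const int example_table[4] = { 1, 2, 3, 4 };

/* const array of pointers: in a position-independent executable the
 * pointers need load-time relocation, so this usually lands in
 * .data.rel.ro rather than .rodata. */
static const char * const example_names[] = { "read", "write", "open" };

/* Hypothetical helper: counts the entries in example_names. */
size_t example_names_count(void)
{
	return sizeof(example_names) / sizeof(example_names[0]);
}

/* Hypothetical helper: references every object so none is discarded. */
int example_touch_all(void)
{
	example_scratch[0] = (char)example_counters[0];
	return example_scratch[0] + example_table[3] +
	       (int)strlen(example_names[0]);
}
```

Building this into a position-independent executable and running
size -A on the result should show the four objects in different
sections, which matches the shift from .data to .data.rel.ro seen in
the tables above when writable arrays are made const.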
The changes are largely removing static variables and replacing them
with local or dynamically allocated memory. A common issue was having
paths in statics for the sake of returning a non-stack pointer to a
buffer, so the APIs were changed to pass buffers in.
Ian Rogers (16):
perf header: Make nodes dynamic in write_mem_topology
perf test x86: insn-x86 test data is immutable so mark it const
perf test x86: intel-pt-test data is immutable so mark it const
perf trace: Make some large static arrays const
perf trace beauty: Make MSR arrays const
tools api fs: Avoid large static PATH_MAX arrays
tools lib api fs tracing_path: Remove two unused MAX_PATH paths
perf daemon: Dynamically allocate path to perf
perf lock: Dynamically allocate lockhash_table
perf timechart: Make large arrays dynamic
perf probe: Dynamically allocate params memory
perf path: Make mkpath thread safe
perf scripting-engines: Move static to local variable
tools api fs: Dynamically allocate cgroupfs mount point cache
perf test pmu: Avoid 2 static path arrays
libsubcmd: Avoid two path statics
tools/lib/api/fs/cgroup.c | 17 ++-
tools/lib/api/fs/fs.c | 25 +++-
tools/lib/api/fs/tracing_path.c | 17 +--
tools/lib/subcmd/exec-cmd.c | 35 +++--
tools/perf/arch/x86/tests/insn-x86.c | 10 +-
tools/perf/arch/x86/tests/intel-pt-test.c | 14 +-
tools/perf/builtin-config.c | 4 +-
tools/perf/builtin-daemon.c | 15 +-
tools/perf/builtin-help.c | 4 +-
tools/perf/builtin-lock.c | 20 ++-
tools/perf/builtin-probe.c | 133 ++++++++++--------
tools/perf/builtin-timechart.c | 48 +++++--
tools/perf/builtin-trace.c | 33 +++--
tools/perf/tests/pmu.c | 17 +--
tools/perf/trace/beauty/beauty.h | 2 +-
.../perf/trace/beauty/tracepoints/x86_msr.sh | 6 +-
tools/perf/util/cache.h | 2 +-
tools/perf/util/config.c | 3 +-
tools/perf/util/header.c | 33 +++--
tools/perf/util/path.c | 35 +----
.../util/scripting-engines/trace-event-perl.c | 4 +-
.../scripting-engines/trace-event-python.c | 5 +-
22 files changed, 278 insertions(+), 204 deletions(-)
--
2.40.1.698.g37aff9b760-goog