Re: [PATCH] perf record: add a shortcut for metrics

From: Arnaldo Carvalho de Melo
Date: Mon May 27 2024 - 13:28:40 EST


On Mon, May 27, 2024 at 02:04:54PM -0300, Arnaldo Carvalho de Melo wrote:
> On Mon, May 27, 2024 at 02:02:33PM -0300, Arnaldo Carvalho de Melo wrote:
> > On Mon, May 27, 2024 at 12:15:19PM +0200, Artem Savkov wrote:
> > > Add -M/--metrics option to perf-record providing a shortcut to record
> > > metrics and metricgroups. This option mirrors the one in perf-stat.

> > > Suggested-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
> > > Signed-off-by: Artem Savkov <asavkov@xxxxxxxxxx>

> > Not building for me, I needed to add the rblist.h header and also I
> > think we need to use metricgroup__rblist_init(&mevents), right?

> Argh, that is a static function, it seems we trigger it by having
> nr_entries = 0, so the following should do the trick:

> struct rblist mevents = { .nr_entries = 0, };

> So that we don't depend on the compiler zeroing that field, which for
> local variables it should not.
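
A minimal sketch of the initialization point above, using a hypothetical
stand-in struct (demo_rblist and its fields are assumptions, not perf's
real struct rblist). In C99 and later, any member left out of a designated
initializer is zero-initialized, whereas a local declared with no
initializer at all has indeterminate contents:

```c
#include <stddef.h>

/* Hypothetical stand-in for perf's struct rblist; only the shape matters. */
struct demo_rblist {
	void *root;
	unsigned int nr_entries;
	int (*node_cmp)(const void *a, const void *b);
};

/* Returns 1 when every member is in the zero/NULL state. */
static int demo_rblist_is_pristine(const struct demo_rblist *l)
{
	return l->root == NULL && l->nr_entries == 0 && l->node_cmp == NULL;
}

/*
 * Build a list on the stack with the initializer form suggested above.
 * Only nr_entries is named, but root and node_cmp are zeroed as well,
 * because omitted members of an initialized aggregate are set to zero.
 */
static struct demo_rblist demo_rblist_make(void)
{
	struct demo_rblist l = { .nr_entries = 0, };

	return l;
}
```

So naming one field is enough to get a fully zeroed local; the hazard is
only the bare "struct rblist mevents;" form.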

How did you test this?

I'm trying:

perf list metric

pick a metric then:

perf record -M tma_core_bound

And it gets in a long loop doing perf_event_open() calls...

root@number:~# perf stat -a -M tma_clears_resteers sleep 1

Performance counter stats for 'system wide':

4,248,865,818 cpu_core/TOPDOWN.SLOTS/ # 0.5 % tma_clears_resteers
652,979,004 cpu_core/topdown-retiring/
332,409,986 cpu_core/topdown-bad-spec/
1,535,823,405 cpu_core/topdown-fetch-lat/
322,562,930 cpu_core/topdown-br-mispredict/
1,977,392,925 cpu_core/topdown-fe-bound/
1,301,619,465 cpu_core/topdown-be-bound/
78,222,034 cpu_core/INT_MISC.CLEAR_RESTEER_CYCLES/
727,201,022 cpu_core/CPU_CLK_UNHALTED.THREAD/
105,140,481 cpu_core/INT_MISC.UNKNOWN_BRANCH_CYCLES/
5,067,924 cpu_core/INT_MISC.UOP_DROPPING/

1.002715853 seconds time elapsed

root@number:~# gdb perf
GNU gdb (Fedora Linux) 14.2-1.fc39
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from perf...
(gdb) run record -a -M tma_clears_resteers sleep 1
Starting program: /root/bin/perf record -a -M tma_clears_resteers sleep 1

This GDB supports auto-downloading debuginfo from the following URLs:
<https://debuginfod.fedoraproject.org/>
Enable debuginfod for this session? (y or [n]) n
Debuginfod has been disabled.
To make this setting permanent, add 'set debuginfod enabled off' to .gdbinit.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[Detaching after fork from child process 688237]
^C
Program received signal SIGINT, Interrupt.
0x00007ffff6f21804 in close () from /lib64/libc.so.6
Missing separate debuginfos, use: dnf debuginfo-install bzip2-libs-1.0.8-16.fc39.x86_64 capstone-4.0.2-15.fc39.x86_64 elfutils-debuginfod-client-0.191-2.fc39.x86_64 elfutils-libelf-0.191-2.fc39.x86_64 elfutils-libs-0.191-2.fc39.x86_64 glib2-2.78.6-1.fc39.x86_64 glibc-2.38-18.fc39.x86_64 keyutils-libs-1.6.3-1.fc39.x86_64 krb5-libs-1.21.2-3.fc39.x86_64 libbabeltrace-1.5.11-5.fc39.x86_64 libcap-2.48-9.fc39.x86_64 libcom_err-1.47.0-2.fc39.x86_64 libcurl-minimal-8.2.1-5.fc39.x86_64 libidn2-2.3.7-1.fc39.x86_64 libnghttp2-1.55.1-5.fc39.x86_64 libpfm-4.13.0-4.fc39.x86_64 libselinux-3.5-5.fc39.x86_64 libstdc++-13.2.1-7.fc39.x86_64 libtraceevent-1.7.2-3.fc39.x86_64 libunistring-1.1-5.fc39.x86_64 libunwind-1.7.0-0.2.rc2.fc39.x86_64 libuuid-2.39.4-1.fc39.x86_64 libzstd-1.5.6-1.fc39.x86_64 numactl-libs-2.0.16-3.fc39.x86_64 opencsd-1.4.0-1.fc39.x86_64 openssl-libs-3.1.1-4.fc39.x86_64 pcre2-10.42-1.fc39.2.x86_64 perl-libs-5.38.2-502.fc39.x86_64 popt-1.19-3.fc39.x86_64 python3-libs-3.12.3-2.fc39.x86_64 slang-2.3.3-4.fc39.x86_64 xz-libs-5.4.4-1.fc39.x86_64 zlib-1.2.13-4.fc39.x86_64
(gdb) bt
#0 0x00007ffff6f21804 in close () from /lib64/libc.so.6
#1 0x000000000061fbd2 in perf_evsel__close_fd_cpu (evsel=0xdab470, cpu_map_idx=6) at evsel.c:188
#2 0x000000000061fc22 in perf_evsel__close_fd (evsel=0xdab470) at evsel.c:197
#3 0x000000000061fc9b in perf_evsel__close (evsel=0xdab470) at evsel.c:211
#4 0x00000000004e0b5f in evlist.reset_weak_group ()
#5 0x0000000000423bb9 in __cmd_record.constprop.0 ()
#6 0x00000000004276c5 in cmd_record ()
#7 0x00000000004c4579 in run_builtin ()
#8 0x00000000004c4889 in handle_internal_command ()
#9 0x0000000000410e57 in main ()
(gdb) c
Continuing.
^C
Program received signal SIGINT, Interrupt.
0x00007ffff6f21804 in close () from /lib64/libc.so.6
(gdb)

So you should investigate this further.

The idea, from my notes, was to be able to have extra columns in 'perf
report' with things like IPC and other metrics, probably not all metrics
will apply. We need to find a way to find out which ones are OK for that
purpose, for instance:

root@number:~# perf stat -a -M tma_branch_resteers sleep 1

Performance counter stats for 'system wide':

209,159,606,886 cpu_core/TOPDOWN.SLOTS/ # 3.2 % tma_branch_resteers
55,156,278,851 cpu_core/topdown-retiring/
7,779,703,706 cpu_core/topdown-bad-spec/
17,644,918,779 cpu_core/topdown-fetch-lat/
39,431,478,422 cpu_core/topdown-fe-bound/
107,325,133,399 cpu_core/topdown-be-bound/
1,066,765,398 cpu_core/INT_MISC.CLEAR_RESTEER_CYCLES/
35,367,316,520 cpu_core/CPU_CLK_UNHALTED.THREAD/
73,066,635 cpu_core/INT_MISC.UNKNOWN_BRANCH_CYCLES/
106,828,690 cpu_core/INT_MISC.UOP_DROPPING/

1.001581758 seconds time elapsed

root@number:~#

But then:

root@number:~# perf record -e cpu_core/TOPDOWN.SLOTS/,cpu_core/topdown-bad-spec/,cpu_core/topdown-fetch-lat/,cpu_core/topdown-fe-bound/,cpu_core/topdown-be-bound/,cpu_core/INT_MISC.CLEAR_RESTEER_CYCLES/,cpu_core/CPU_CLK_UNHALTED.THREAD/,cpu_core/INT_MISC.UNKNOWN_BRANCH_CYCLES/,cpu_core/INT_MISC.UOP_DROPPING/
WARNING: events were regrouped to match PMUs
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu_core/topdown-bad-spec/).
/bin/dmesg | grep -i perf may provide additional information.

root@number:~#

That invalid argument error message needs improvement, but it's one
example of a metric whose events can't be sampled with 'perf record' for
some reason:

Opening: cpu_core/topdown-bad-spec/
------------------------------------------------------------
perf_event_attr:
type 4 (cpu_core)
size 136
config 0x8100 (topdown-bad-spec)
{ sample_period, sample_freq } 4000
sample_type IP|TID|TIME|CPU|PERIOD|IDENTIFIER
read_format ID|LOST
disabled 1
inherit 1
freq 1
sample_id_all 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8
sys_perf_event_open failed, error -22
switching off PERF_FORMAT_LOST support
Opening: cpu_core/topdown-bad-spec/

It goes on disabling several perf_event_attr features, assuming the
kernel doesn't support them, but ultimately fails and returns the
cryptic EINVAL.
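
To illustrate the retry pattern described above, here is a simplified
sketch. It is not perf's actual code: demo_attr, demo_open_fn and the
order in which features are dropped are all assumptions made for the
example. It shows why an event the PMU rejects outright burns through
every fallback and still ends in EINVAL:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified attribute set; not the real perf_event_attr. */
struct demo_attr {
	bool format_lost;
	bool sample_id_all;
	bool exclude_guest;
};

/* Hypothetical opener: returns an fd >= 0 or a negative errno. */
typedef int (*demo_open_fn)(const struct demo_attr *attr);

/*
 * Try to open; on -EINVAL assume the kernel lacks the next optional
 * feature, switch it off and retry. Gives up once nothing is left to drop.
 */
static int demo_open_with_fallback(struct demo_attr *attr,
				   demo_open_fn open_fn, int *retries)
{
	bool *optional[] = {
		&attr->format_lost, &attr->sample_id_all, &attr->exclude_guest,
	};
	const size_t n = sizeof(optional) / sizeof(optional[0]);
	size_t i = 0;
	int ret;

	*retries = 0;
	for (;;) {
		ret = open_fn(attr);
		if (ret != -EINVAL)
			return ret;
		/* Skip features that are already off. */
		while (i < n && !*optional[i])
			i++;
		if (i == n)
			return ret; /* nothing left to disable: surface EINVAL */
		*optional[i] = false;
		(*retries)++;
	}
}

/* Models an event that cannot be opened this way at all. */
static int demo_always_einval(const struct demo_attr *attr)
{
	(void)attr;
	return -EINVAL;
}

/* Models an older kernel that only lacks the "lost" read format. */
static int demo_no_format_lost(const struct demo_attr *attr)
{
	return attr->format_lost ? -EINVAL : 3;
}
```

With demo_always_einval the caller sees the same bare EINVAL after all
three features have been dropped, which is why the final message says
nothing about the real cause.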

Ian, can you take a look at this:

root@number:~# perf stat -a -M tma_branch_resteers sleep 1

Performance counter stats for 'system wide':

207,780,999,822 cpu_core/TOPDOWN.SLOTS/ # 5.6 % tma_branch_resteers
46,114,346,088 cpu_core/topdown-retiring/
12,533,625,786 cpu_core/topdown-bad-spec/
25,845,036,349 cpu_core/topdown-fetch-lat/
50,198,057,652 cpu_core/topdown-fe-bound/
99,605,368,200 cpu_core/topdown-be-bound/
1,720,994,647 cpu_core/INT_MISC.CLEAR_RESTEER_CYCLES/
39,224,461,225 cpu_core/CPU_CLK_UNHALTED.THREAD/
469,464,484 cpu_core/INT_MISC.UNKNOWN_BRANCH_CYCLES/
260,388,972 cpu_core/INT_MISC.UOP_DROPPING/

1.004820692 seconds time elapsed

root@number:~# perf stat -a -e cpu_core/topdown-bad-spec/ sleep 1

Performance counter stats for 'system wide':

<not supported> cpu_core/topdown-bad-spec/

1.003017044 seconds time elapsed

root@number:~# perf stat -a -e cpu_atom/topdown-bad-spec/ sleep 1

Performance counter stats for 'system wide':

19,178,297,593 cpu_atom/topdown-bad-spec/

1.002640873 seconds time elapsed

root@number:~#

It states that it was able to count cpu_core/topdown-bad-spec/ when
called via the tma_branch_resteers metric, but if I ask for it directly
it says it's not supported on cpu_core, while it works on cpu_atom.
This looks wrong, no?

- Arnaldo