On Thu, Nov 14, 2024 at 04:04:48PM +0000, James Clark wrote:
Since the linked fixes: commit, specifying a CPU on hybrid platforms
results in an error because Perf tries to open an extended type event
on "any" CPU which isn't valid. Extended type events can only be opened
on CPUs that match the type.
Before (working):
$ perf record --cpu 1 -- true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.385 MB perf.data (7 samples) ]
After (not working):
$ perf record -C 1 -- true
WARNING: A requested CPU in '1' is not supported by PMU 'cpu_atom' (CPUs 16-27) for event 'cycles:P'
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu_atom/cycles:P/).
/bin/dmesg | grep -i perf may provide additional information.
(Ignore the warning message, that's expected and not particularly
relevant to this issue).
This is because perf_cpu_map__intersect() of the user specified CPU (1)
and one of the PMU's CPUs (16-27) correctly results in an empty (NULL)
CPU map. However for the purposes of opening an event, libperf converts
empty CPU maps into an any CPU (-1) which the kernel rejects.
Fix it by deleting evsels with empty CPU maps in the specific case where
user requested CPU maps are evaluated.
Namhyung, I think this should go via perf-tools,
Before, on an hybrid Intel processor:
⬢ [acme@toolbox perf-tools]$ grep -m1 'model name' /proc/cpuinfo
model name : 13th Gen Intel(R) Core(TM) i7-1365U
⬢ [acme@toolbox perf-tools]$
root@x1:~# perf record -C 1 -- true
WARNING: A requested CPU in '1' is not supported by PMU 'cpu_atom' (CPUs 4-11) for event 'cycles:P'
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu_atom/cycles/P).
"dmesg | grep -i perf" may provide additional information.
root@x1:~#
After the patch:
root@x1:~# perf record -C 1 -- true
WARNING: A requested CPU in '1' is not supported by PMU 'cpu_atom' (CPUs 4-11) for event 'cycles:P'
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.369 MB perf.data (8 samples) ]
root@x1:~#
root@x1:~# perf evlist -v
cpu_core/cycles/P: type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0x400000000, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD|IDENTIFIER, read_format: ID|LOST, disabled: 1, freq: 1, precise_ip: 3, sample_id_all: 1
dummy:u: type: 1 (software), size: 136, config: 0x9 (PERF_COUNT_SW_DUMMY), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|IDENTIFIER, read_format: ID|LOST, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
root@x1:~#
All on CPU 1:
root@x1:~# perf report -D | grep PERF_RECORD_SAMPLE
1 1809541166457609 0x15e410 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9b1ce418 period: 1 addr: 0
1 1809541166470019 0x15e448 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9a0206b1 period: 1 addr: 0
1 1809541166475543 0x15e480 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9a0a9158 period: 5 addr: 0
1 1809541166478316 0x15e4b8 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9a0a9158 period: 53 addr: 0
1 1809541166480879 0x15e4f0 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9a0a9158 period: 667 addr: 0
1 1809541166483895 0x15e528 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9b1bcbcc period: 8537 addr: 0
1 1809541166489116 0x15e560 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9a009b3b period: 98429 addr: 0
1 1809541168375508 0x15e620 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xffffffff9b0e2a34 period: 732596 addr: 0
root@x1:~#
Tested-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
James, the second patch isn't applying to perf-tools/perf-tools.
- Arnaldo
Fixes: 251aa040244a ("perf parse-events: Wildcard most "numeric" events")
Reviewed-by: Ian Rogers <irogers@xxxxxxxxxx>
Tested-by: Thomas Falcon <thomas.falcon@xxxxxxxxx>
Signed-off-by: James Clark <james.clark@xxxxxxxxxx>
---
tools/lib/perf/evlist.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)
diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index c6d67fc9e57e..83c43dc13313 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -47,6 +47,20 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
*/
perf_cpu_map__put(evsel->cpus);
evsel->cpus = perf_cpu_map__intersect(evlist->user_requested_cpus, evsel->own_cpus);
+
+ /*
+ * Empty cpu lists would eventually get opened as "any" so remove
+ * genuinely empty ones before they're opened in the wrong place.
+ */
+ if (perf_cpu_map__is_empty(evsel->cpus)) {
+ struct perf_evsel *next = perf_evlist__next(evlist, evsel);
+
+ perf_evlist__remove(evlist, evsel);
+ /* Keep idx contiguous */
+ if (next)
+ list_for_each_entry_from(next, &evlist->entries, node)
+ next->idx--;
+ }
} else if (!evsel->own_cpus || evlist->has_user_cpus ||
(!evsel->requires_cpu && perf_cpu_map__has_any_cpu(evlist->user_requested_cpus))) {
/*
@@ -80,11 +94,11 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
{
- struct perf_evsel *evsel;
+ struct perf_evsel *evsel, *n;
evlist->needs_map_propagation = true;
- perf_evlist__for_each_evsel(evlist, evsel)
+ list_for_each_entry_safe(evsel, n, &evlist->entries, node)
__perf_evlist__propagate_maps(evlist, evsel);
}
--
2.34.1