Re: [PATCH] perf record: skip synthesize event when open evsel failed
From: Shuai Xue
Date: Thu Oct 30 2025 - 22:36:49 EST
On 2025/10/31 01:32, Ian Rogers wrote:
On Wed, Oct 29, 2025 at 5:55 AM Shuai Xue <xueshuai@xxxxxxxxxxxxxxxxx> wrote:
On 2025/10/24 10:45, Shuai Xue wrote:
On 2025/10/24 00:08, Ian Rogers wrote:
On Wed, Oct 22, 2025 at 6:50 PM Shuai Xue <xueshuai@xxxxxxxxxxxxxxxxx> wrote:
When using perf record with the `--overwrite` option, a segmentation fault
occurs if an event fails to open. For example:
perf record -e cycles-ct -F 1000 -a --overwrite
Error:
cycles-ct:H: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat'
perf: Segmentation fault
#0 0x6466b6 in dump_stack debug.c:366
#1 0x646729 in sighandler_dump_stack debug.c:378
#2 0x453fd1 in sigsegv_handler builtin-record.c:722
#3 0x7f8454e65090 in __restore_rt libc-2.32.so[54090]
#4 0x6c5671 in __perf_event__synthesize_id_index synthetic-events.c:1862
#5 0x6c5ac0 in perf_event__synthesize_id_index synthetic-events.c:1943
#6 0x458090 in record__synthesize builtin-record.c:2075
#7 0x45a85a in __cmd_record builtin-record.c:2888
#8 0x45deb6 in cmd_record builtin-record.c:4374
#9 0x4e5e33 in run_builtin perf.c:349
#10 0x4e60bf in handle_internal_command perf.c:401
#11 0x4e6215 in run_argv perf.c:448
#12 0x4e653a in main perf.c:555
#13 0x7f8454e4fa72 in __libc_start_main libc-2.32.so[3ea72]
#14 0x43a3ee in _start ??:0
The --overwrite option implies --tail-synthesize, which collects non-sample
events reflecting the system status when recording finishes. However, when
evsel opening fails (e.g., unsupported event 'cycles-ct'), session->evlist
is not initialized and remains NULL. The code unconditionally calls
record__synthesize() in the error path, which iterates through the NULL
evlist pointer and causes a segfault.
To fix it, move the record__synthesize() call inside the error check block, so
it's only called when there was no error during recording, ensuring that evlist
is properly initialized.
Fixes: 4ea648aec019 ("perf record: Add --tail-synthesize option")
Signed-off-by: Shuai Xue <xueshuai@xxxxxxxxxxxxxxxxx>
This looks great! I wonder if we can add a test, perhaps here:
https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/tests/shell/record.sh?h=perf-tools-next#n435
something like:
```
$ perf record -e foobar -F 1000 -a --overwrite -o /dev/null -- sleep 0.1
```
in a new test subsection for test_overwrite? foobar would be an event
that we could assume isn't present. Could you help with a test
covering the problems you've uncovered and perhaps related flags?
Hi, Ian,
Good suggestion, I'd like to add a test. But foobar may not be a good case.
Regarding your example:
perf record -e foobar -a --overwrite -o /dev/null -- sleep 0.1
event syntax error: 'foobar'
\___ Bad event name
Unable to find event on a PMU of 'foobar'
Run 'perf list' for a list of valid events
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-e, --event <event> event selector. use 'perf list' to list available events
The issue with using foobar is that it's an invalid event name, and the
perf parser will reject it much earlier. This means the test would exit
before reaching the part of the code path we want to verify (where
record__synthesize() could be called).
A potential alternative could be testing an error case such as EACCES:
perf record -e cycles -C 0 --overwrite -o /dev/null -- sleep 0.1
This could reproduce the scenario of a failure when attempting to access
a valid event, such as due to permission restrictions. However, the
limitation here is that users may override
/proc/sys/kernel/perf_event_paranoid, which affects whether or not this
test would succeed in triggering an EACCES error.
If you have any other suggestions or ideas for a better way to simulate
this situation, I'd love to hear them.
Thanks.
Shuai
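Whether the EACCES case above is reachable on a given machine could be decided up front, so the test can skip itself instead of failing. A minimal sketch, assuming a hypothetical helper named `eacces_expected` (it does not exist in record.sh) and ignoring capabilities such as CAP_PERFMON; it relies on the documented behaviour that perf_event_paranoid levels of 1 and above disallow CPU event access for unprivileged users:

```shell
#!/bin/sh
# Hypothetical helper, not part of tools/perf/tests/shell/record.sh.
# Usage: eacces_expected <paranoid_level> <uid>
# Returns 0 when a CPU-wide event open by this user would normally be
# refused, 1 otherwise.
# Assumption: capabilities such as CAP_PERFMON are not considered.
eacces_expected() {
  paranoid=$1
  uid=$2
  # root is not subject to the perf_event_paranoid restriction.
  [ "$uid" -eq 0 ] && return 1
  # Levels >= 1 disallow CPU event access for unprivileged users.
  [ "$paranoid" -ge 1 ]
}
```

A test might call it as `eacces_expected "$(cat /proc/sys/kernel/perf_event_paranoid)" "$(id -u)"` and skip the EACCES scenario when it returns non-zero.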
Hi, Ian,
Gentle ping.
Sorry for the delay. I was trying to think of a better way given the
problems you mention and then got distracted. I wonder if a legacy
event that core PMUs never implement would be a good candidate to
test. For example, the event "node-prefetch-misses" is for "Local
memory prefetch misses" but the memory controller tends to be a
separate PMU and this event is never implemented to my knowledge.
Running this locally I see:
```
$ perf record -e node-prefetch-misses -a --overwrite -o /dev/null -- sleep 0.1
Lowering default frequency rate from 4000 to 1750.
Please consider tweaking /proc/sys/kernel/perf_event_max_sample_rate.
Error:
Failure to open event 'cpu_atom/node-prefetch-misses/' on PMU 'cpu_atom' which will be removed.
No fallback found for 'cpu_atom/node-prefetch-misses/' for error 2
Error:
Failure to open event 'cpu_core/node-prefetch-misses/' on PMU 'cpu_core' which will be removed.
No fallback found for 'cpu_core/node-prefetch-misses/' for error 2
Error:
Failure to open any events for recording.
perf: Segmentation fault
#0 0x55a487ad8b87 in dump_stack debug.c:366
#1 0x55a487ad8bfd in sighandler_dump_stack debug.c:378
#2 0x55a4878c6f94 in sigsegv_handler builtin-record.c:722
#3 0x7f72aae49df0 in __restore_rt libc_sigaction.c:0
#4 0x55a487b57ef8 in __perf_event__synthesize_id_index synthetic-events.c:1862
#5 0x55a487b58346 in perf_event__synthesize_id_index synthetic-events.c:1943
#6 0x55a4878cb2a3 in record__synthesize builtin-record.c:2150
#7 0x55a4878cdada in __cmd_record builtin-record.c:2963
#8 0x55a4878d11ca in cmd_record builtin-record.c:4453
#9 0x55a48795b3cc in run_builtin perf.c:349
#10 0x55a48795b664 in handle_internal_command perf.c:401
#11 0x55a48795b7bd in run_argv perf.c:448
#12 0x55a48795bb06 in main perf.c:555
#13 0x7f72aae33ca8 in __libc_start_call_main libc_start_call_main.h:74
#14 0x7f72aae33d65 in __libc_start_main_alias_2 libc-start.c:128
#15 0x55a4878acf41 in _start perf[52f41]
Segmentation fault
```
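The reproducer above suggests one possible shape for a regression check: run perf record with an event that parses but fails to open, then verify that perf exits with an ordinary error instead of dying on a signal. A minimal sketch, assuming a hypothetical helper named `run_without_crash` (not an existing function in record.sh) and assuming a crash shows up either as an exit status of 128 or more or as "Segmentation fault" in the captured output:

```shell
#!/bin/sh
# Hypothetical helper, not part of tools/perf/tests/shell/record.sh.
# Usage: run_without_crash <command> [args...]
# Returns 0 if the command exited normally (any exit status below 128),
# 1 if it appears to have crashed, 2 if no temporary file could be made.
run_without_crash() {
  log=$(mktemp) || return 2
  # Capture stdout and stderr so the output can be inspected afterwards.
  "$@" > "$log" 2>&1
  status=$?
  crashed=0
  # The shell reports death by signal as 128 + signal number.
  if [ "$status" -ge 128 ]; then
    crashed=1
  fi
  # Also catch a segfault message printed by the command itself.
  if grep -q "Segmentation fault" "$log"; then
    crashed=1
  fi
  rm -f "$log"
  return "$crashed"
}
```

In a test it might be invoked as `run_without_crash perf record -e node-prefetch-misses -a --overwrite -o /dev/null -- sleep 0.1`, with the event name chosen so that it parses but fails to open on the test machine.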
Hi, Ian,
Is node-prefetch-misses a platform-specific event? Running it on ARM Yitian 710
and Intel SPR platforms, I see:
$ sudo perf record -e node-prefetch-misses
Error:
The node-prefetch-misses event is not supported.
Thanks.
Shuai