Re: [PATCH] perf trace: Fix SIGSEGV when processing augmented args

From: Naveen N. Rao
Date: Tue Mar 15 2022 - 13:28:14 EST


Arnaldo Carvalho de Melo wrote:
Em Thu, Mar 10, 2022 at 04:17:41PM +0530, Naveen N. Rao escreveu:
On powerpc, 'perf trace' is crashing with a SIGSEGV when trying to
process a perf data file created with 'perf trace record -p':

#0 0x00000001225b8988 in syscall_arg__scnprintf_augmented_string <snip> at builtin-trace.c:1492
#1 syscall_arg__scnprintf_filename <snip> at builtin-trace.c:1492
#2 syscall_arg__scnprintf_filename <snip> at builtin-trace.c:1486
#3 0x00000001225bdd9c in syscall_arg_fmt__scnprintf_val <snip> at builtin-trace.c:1973
#4 syscall__scnprintf_args <snip> at builtin-trace.c:2041
#5 0x00000001225bff04 in trace__sys_enter <snip> at builtin-trace.c:2319

The size captured in the augmented arg looks corrupt, resulting in the
augmented arg pointer being adjusted incorrectly. Fix this by checking
that the size is reasonable.

Reported-by: Claudio Carvalho <cclaudio@xxxxxxxxxxxxx>
Signed-off-by: Naveen N. Rao <naveen.n.rao@xxxxxxxxxxxxxxxxxx>
---
While this resolves the 'perf trace' crash, I'm not yet sure why the
size for the augmented arg is corrupt. This looks to be happening when
processing the sample for 'read' syscall. Any pointers?

Strange indeed, the augmented args code should kick in when the payload
for some tracepoint is bigger than what is expected, i.e. more than the
sum of its arguments, in which case it assumes that what is coming after
the expected payload comes with, say, the pathname for an open, openat,
etc syscall, that otherwise would be just a pointer.

This augmentation is done using something like
tools/perf/examples/bpf/augmented_raw_syscalls.c, i.e.:

[root@quaco ~]# perf trace -e openat sleep 1
0.000 openat(dfd: CWD, filename: 0x1bbb0b6, flags: RDONLY|CLOEXEC) = 3
0.021 openat(dfd: CWD, filename: 0x1bc8e80, flags: RDONLY|CLOEXEC) = 3
0.201 openat(dfd: CWD, filename: 0x1b34f20, flags: RDONLY|CLOEXEC) = 3
[root@quaco ~]# perf trace -e ~acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c,openat sleep 1
0.000 openat(dfd: CWD, filename: "/etc/ld.so.cache", flags: RDONLY|CLOEXEC) = 3
0.023 openat(dfd: CWD, filename: "/lib64/libc.so.6", flags: RDONLY|CLOEXEC) = 3
0.225 openat(dfd: CWD, filename: "/usr/lib/locale/locale-archive", flags: RDONLY|CLOEXEC) = 3
[root@quaco ~]#

Thanks, that clarifies things a bit. I was under the impression that syscalls are augmented through the bpf program by default. But, it looks like that isn't the case (at least on the distro where this problem was reported).


So it seems the perf.data file being processed is corrupted somehow...

Perhaps we should check if the syscall has pointer args and if not don't
kick in the augmented code and instead emit some warning about having
more payload than expected.

Yes, it looks like the current check in 'perf' isn't working. The below patch also resolves the crash we are seeing:

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 2f1d20553a0aa3..86b459f4ebdd61 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -2326,7 +2326,7 @@ static int trace__sys_enter(struct trace *trace, struct evsel *evsel,
* thinking that the extra 2 u64 args are the augmented filename, so just check
* here and avoid using augmented syscalls when the evsel is the raw_syscalls one.
*/
- if (evsel != trace->syscalls.events.sys_enter)
+ if (strcmp(evsel__name(evsel), "raw_syscalls:sys_enter"))
augmented_args = syscall__augmented_args(sc, sample, &augmented_args_size, trace->raw_augmented_syscalls_args_size);
ttrace->entry_time = sample->time;
msg = ttrace->entry_str;


Thanks,
Naveen