[PATCH v1 0/3] perf trace: Augment struct pointer arguments

From: Howard Chu
Date: Wed Jul 31 2024 - 15:50:05 EST


Prerequisite: this series is built on top of v5 of the enum augmentation
series.

This patch series adds augmentation for struct pointer, string and
buffer arguments, all in one. It also fixes 'perf trace -p <PID>', but
unfortunately it breaks 'perf trace <workload>'. That will be fixed in
v2.

With this patch series, perf trace can augment struct pointer arguments
for syscalls such as clone3, epoll_wait and write. However, it collects
the data only once, when the syscall is entered. Syscalls that take a
pointer so that the kernel can write through it are therefore not
augmented well. I call these read-like syscalls, because the caller
uses them to read data from the kernel. This patch series only augments
write-like syscalls well.

Unfortunately, there are more read-like syscalls (such as read,
readlinkat and even gettimeofday) than write-like syscalls (write,
pwrite64, epoll_wait, clone3).
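
To illustrate why collecting at syscall entry is not enough for a
read-like syscall, here is a minimal user-space sketch. It is not part
of the series: snapshot_read() is a hypothetical helper that copies a
read() buffer just before and just after the call, standing in for what
a collector would see at entry and at exit.

```c
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the series: snapshot a read() buffer
 * before and after the syscall. 'at_enter' and 'at_exit' must each hold
 * at least 'len' bytes. Returns the number of bytes read, or -1 on error.
 */
ssize_t snapshot_read(char *at_enter, char *at_exit, size_t len)
{
	int fds[2];
	char buf[64];
	ssize_t n;

	if (len > sizeof(buf) || pipe(fds) < 0)
		return -1;

	/* Put known data in the pipe so read() has something to return. */
	if (write(fds[1], "from-kernel", 11) != 11) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}

	memset(buf, 'x', sizeof(buf));	/* stale user-space contents */
	memcpy(at_enter, buf, len);	/* buffer as seen before the call */

	n = read(fds[0], buf, len);	/* the kernel fills buf here */

	memcpy(at_exit, buf, len);	/* buffer as seen after the call */

	close(fds[0]);
	close(fds[1]);
	return n;
}
```

Calling snapshot_read(a, b, 11) leaves a filled with 'x' bytes and b
holding "from-kernel": the data of interest only exists once the
syscall has returned.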

Here are three test programs that I find useful:

pwrite64
```
#include <unistd.h>
#include <sys/syscall.h>

int main(void)
{
	int i1 = 1, i2 = 2, i3 = 3, i4 = 4;
	char s1[] = "DI\0NGZ\0HE\1N", s2[] = "XUEBAO";

	while (1) {
		/* fd = i1, buf = s1, count = sizeof(s1), offset = i2 */
		syscall(SYS_pwrite64, i1, s1, sizeof(s1), i2);
		sleep(1);
	}

	return 0;
}
```
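
The payload in the program above contains embedded NUL bytes, so a
collector that treats the argument as a C string and one that honours
the length argument would capture different amounts of data. A small
sketch of the difference (the helper names are hypothetical and not
part of the series):

```c
#include <stddef.h>
#include <string.h>

/* Same payload as s1 in the pwrite64 test program. */
static const char payload[] = "DI\0NGZ\0HE\1N";

/* Bytes a NUL-terminated string copy would see: stops at the first \0. */
size_t payload_string_len(void)
{
	return strlen(payload);
}

/* Bytes passed as the count argument via sizeof(), trailing NUL included. */
size_t payload_buffer_len(void)
{
	return sizeof(payload);
}
```

Here payload_string_len() returns 2 while payload_buffer_len() returns
12, so only a length-based copy covers the whole buffer.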

epoll_wait
```
#include <unistd.h>
#include <sys/epoll.h>
#include <stdlib.h>
#include <string.h>

#define MAXEVENTS 2

int main(void)
{
	int i1 = 1, i2 = 2, i3 = 3, i4 = 4;
	char s1[] = "DINGZHEN", s2[] = "XUEBAO";

	struct epoll_event ee = {
		.events = 114,
		.data.ptr = NULL,
	};

	struct epoll_event *events = calloc(MAXEVENTS, sizeof(struct epoll_event));

	if (!events)
		return 1;

	/* Seed the first event so the struct has recognizable contents. */
	memcpy(events, &ee, sizeof(ee));

	while (1) {
		/* epfd = i1, maxevents = i2, timeout = i3 ms */
		epoll_wait(i1, events, i2, i3);
		sleep(1);
	}

	return 0;
}
```

clone3
```
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/sched.h>
#include <string.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
	int i1 = 1, i2 = 2, i3 = 3, i4 = 4;
	char s1[] = "DINGZHEN", s2[] = "XUEBAO";

	struct clone_args cla = {
		.flags = 1,
		.pidfd = 1,
		.child_tid = 4,
		.parent_tid = 5,
		.exit_signal = 1,
		.stack = 4,
		.stack_size = 1,
		.tls = 9,
		.set_tid = 1,
		.set_tid_size = 9,
		.cgroup = 8,
	};

	while (1) {
		/* args = &cla, size = i1 */
		syscall(SYS_clone3, &cla, i1);
		sleep(1);
	}

	return 0;
}
```

Please save them, then compile and run them. In a separate window, run
'ps aux | grep a.out' to get their PIDs (I'm sorry, but tracing a
workload is broken after the PID fix), and trace them with -p,
optionally adding -e <syscall-name>. Reminder: the third program cannot
be traced with -e clone, only with -e clone3.

Although augmentation of read-like syscalls is not fully supported yet,
I am making significant progress. After a lot of debugging, I am
confident that I can implement it in v2.

Howard Chu (3):
  perf trace: Set up beauty_map, load it to BPF
  perf trace: Collect augmented data using BPF
  perf trace: Fix perf trace -p <PID>

 tools/perf/builtin-trace.c                    | 253 +++++++++++++++++-
 .../bpf_skel/augmented_raw_syscalls.bpf.c     | 121 ++++++++-
 tools/perf/util/evlist.c                      |  14 +-
 tools/perf/util/evlist.h                      |   1 +
 tools/perf/util/evsel.c                       |   3 +
 5 files changed, 386 insertions(+), 6 deletions(-)

--
2.45.2