Re: [PATCH v2 7/7] tools/testing/selftests/bpf: replace open-coded 16 with TASK_COMM_LEN
From: John Stultz
Date: Wed Feb 08 2023 - 19:54:22 EST
On Wed, Feb 8, 2023 at 4:11 PM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Wed, Feb 8, 2023 at 2:01 PM John Stultz <jstultz@xxxxxxxxxx> wrote:
> >
> > On Sat, Nov 20, 2021 at 11:27:38AM +0000, Yafang Shao wrote:
> > > As the sched:sched_switch tracepoint args are derived from the kernel,
> > > we'd better make it same with the kernel. So the macro TASK_COMM_LEN is
> > > converted to type enum, then all the BPF programs can get it through BTF.
> > >
> > > The BPF program which wants to use TASK_COMM_LEN should include the header
> > > vmlinux.h. Regarding the test_stacktrace_map and test_tracepoint, as the
> > > type defined in linux/bpf.h are also defined in vmlinux.h, so we don't
> > > need to include linux/bpf.h again.
> > >
> > > Signed-off-by: Yafang Shao <laoar.shao@xxxxxxxxx>
> > > Acked-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
> > > Acked-by: David Hildenbrand <david@xxxxxxxxxx>
> > > Cc: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
> > > Cc: Arnaldo Carvalho de Melo <arnaldo.melo@xxxxxxxxx>
> > > Cc: Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx>
> > > Cc: Michal Miroslaw <mirq-linux@xxxxxxxxxxxx>
> > > Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > > Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
> > > Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
> > > Cc: David Hildenbrand <david@xxxxxxxxxx>
> > > Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
> > > Cc: Kees Cook <keescook@xxxxxxxxxxxx>
> > > Cc: Petr Mladek <pmladek@xxxxxxxx>
> > > ---
> > > include/linux/sched.h | 9 +++++++--
> > > tools/testing/selftests/bpf/progs/test_stacktrace_map.c | 6 +++---
> > > tools/testing/selftests/bpf/progs/test_tracepoint.c | 6 +++---
> > > 3 files changed, 13 insertions(+), 8 deletions(-)
> >
> > Hey all,
> > I know this is a little late, but I recently got a report that
> > this change was causiing older versions of perfetto to stop
> > working.
> >
> > Apparently newer versions of perfetto has worked around this
> > via the following changes:
> > https://android.googlesource.com/platform/external/perfetto/+/c717c93131b1b6e3705a11092a70ac47c78b731d%5E%21/
> > https://android.googlesource.com/platform/external/perfetto/+/160a504ad5c91a227e55f84d3e5d3fe22af7c2bb%5E%21/
> >
> > But for older versions of perfetto, reverting upstream commit
> > 3087c61ed2c4 ("tools/testing/selftests/bpf: replace open-coded 16
> > with TASK_COMM_LEN") is necessary to get it back to working.
> >
> > I haven't dug very far into the details, and obviously this doesn't
> > break with the updated perfetto, but from a high level this does
> > seem to be a breaking-userland regression.
> >
> > So I wanted to reach out to see if there was more context for this
> > breakage? I don't want to raise a unnecessary stink if this was
> > an unfortuante but forced situation.
>
> Let me understand what you're saying...
>
> The commit 3087c61ed2c4 did
>
> -/* Task command name length: */
> -#define TASK_COMM_LEN 16
> +/*
> + * Define the task command name length as enum, then it can be visible to
> + * BPF programs.
> + */
> +enum {
> + TASK_COMM_LEN = 16,
> +};
>
>
> and that caused:
>
> cat /sys/kernel/debug/tracing/events/task/task_newtask/format
>
> to print
> field:char comm[TASK_COMM_LEN]; offset:12; size:16; signed:0;
> instead of
> field:char comm[16]; offset:12; size:16; signed:0;
>
> so the ftrace parsing android tracing tool had to do:
>
> - if (Match(type_and_name.c_str(), R"(char [a-zA-Z_]+\[[0-9]+\])")) {
> + if (Match(type_and_name.c_str(),
> + R"(char [a-zA-Z_][a-zA-Z_0-9]*\[[a-zA-Z_0-9]+\])")) {
>
> to workaround this change.
> Right?
I believe so.
> And what are you proposing?
I'm not proposing anything. I was just wanting to understand more
context around this, as it outwardly appears to be a user-breaking
change, and that is usually not done, so I figured it was an issue
worth raising.
If the debug/tracing/*/format output is in the murky not-really-abi
space, that's fine, but I wanted to know if this was understood as
something that may require userland updates or if this was a
unexpected side-effect.
thanks
-john