Re: [PATCH v2 1/4] perf trace: Exit when build eBPF program failure
From: Leo Yan
Date: Thu Jun 06 2019 - 10:00:55 EST
Hi Arnaldo,
On Thu, Jun 06, 2019 at 10:30:19AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Thu, Jun 06, 2019 at 05:48:42PM +0800, Leo Yan escreveu:
> > On my Juno board with ARM64 CPUs, perf trace command reports the eBPF
> > program building failure but the command will not exit and continue to
> > run. If we define an eBPF event in config file, the event will be
> > parsed with below flow:
> >
> > perf_config()
> > `> trace__config()
> > `> parse_events_option()
> > `> parse_events__scanner()
> > `-> parse_events_parse()
> > `> parse_events_load_bpf()
> > `> llvm__compile_bpf()
> >
> > Though the low level functions return back error values when detect eBPF
> > building failure, but parse_events_option() returns 1 for this case and
>
> (gdb) n
> parse_events__scanner (str=0xb9d170 "/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o", parse_state=0x7fffffff7fa0,
> start_token=258) at util/parse-events.c:1870
> 1870 parse_events__delete_buffer(buffer, scanner);
> (gdb) n
> 1871 parse_events_lex_destroy(scanner);
> (gdb) n
> 1872 return ret;
> (gdb) p ret
> $53 = 1
> (gdb) bt
> #0 parse_events__scanner (str=0xb9d170 "/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o", parse_state=0x7fffffff7fa0,
> start_token=258) at util/parse-events.c:1872
> #1 0x000000000050a926 in parse_events (evlist=0xb9e5d0, str=0xb9d170 "/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o",
> err=0x7fffffff8020) at util/parse-events.c:1907
> #2 0x000000000050ad94 in parse_events_option (opt=0x7fffffff8080,
> str=0xb9d170 "/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o", unset=0) at util/parse-events.c:2007
> #3 0x0000000000497fa8 in trace__config (var=0x7fffffff8150 "trace.add_events",
> value=0xb9d170 "/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o", arg=0x7fffffffa1c0) at builtin-trace.c:3706
> #4 0x00000000004e9a79 in perf_config (fn=0x497ee4 <trace__config>, data=0x7fffffffa1c0) at util/config.c:738
> #5 0x0000000000498c97 in cmd_trace (argc=2, argv=0x7fffffffd690) at builtin-trace.c:3865
> #6 0x00000000004d8c17 in run_builtin (p=0xa0e600 <commands+576>, argc=2, argv=0x7fffffffd690) at perf.c:303
> #7 0x00000000004d8e84 in handle_internal_command (argc=2, argv=0x7fffffffd690) at perf.c:355
> #8 0x00000000004d8fd3 in run_argv (argcp=0x7fffffffd4ec, argv=0x7fffffffd4e0) at perf.c:399
> #9 0x00000000004d933f in main (argc=2, argv=0x7fffffffd690) at perf.c:521
> (gdb)
>
> So its parse_events__scanner() that returns 1, parse_events() propagate
> that and:
>
> parse_events_option (opt=0x7fffffff8080, str=0xb9d170 "/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o", unset=0)
> at util/parse-events.c:2009
> 2009 if (ret) {
> (gdb) p ret
> $56 = 1
> (gdb) n
> 2010 parse_events_print_error(&err, str);
> (gdb) n
> event syntax error: '/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o'
> \___ Kernel verifier blocks program loading
>
> (add -v to see detail)
> 2011 fprintf(stderr, "Run 'perf list' for a list of valid events\n");
> (gdb)
>
> So the -4007 error is printed, and all we can say is that parsing events
> failed, but we end up not propagating that error back when we use
> parse_events_option(), we could use instead:
>
> struct parse_events_error err = { .idx = 0, };
> int ret = parse_events(evlist, str, &err);
>
> And make parse_events_error have the raw err, i.e. -4007 in this case:
>
> [ERRCODE_OFFSET(VERIFY)] = "Kernel verifier blocks program loading",
>
> In your case would be something else, I'm just trying to load the
> precompiled .o that does things the BPF kernel verifier doesn't like.
Yes, exactly. My failure is compilation failure but not BPF verifier
failure.
> So yeah, your patch looks ok, i.e. parse_events_option() returning !0
> should make trace__config() return -1.
>
> But see below:
>
> - Arnaldo
>
> > trace__config() passes 1 to perf_config(); perf_config() doesn't treat
> > the returned value 1 as failure and it continues to parse other
> > configurations. Thus the perf command continues to run even without
> > enabling eBPF event successfully.
> >
> > This patch changes error handling in trace__config(), when it detects
> > failure it will return -1 rather than directly pass error value (1);
> > finally, perf_config() will directly bail out and perf will exit for
> > this case.
> >
> > Signed-off-by: Leo Yan <leo.yan@xxxxxxxxxx>
> > ---
> > tools/perf/builtin-trace.c | 8 ++++++++
> > 1 file changed, 8 insertions(+)
> >
> > diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> > index 54b2d0fd0d02..4b5d004aab74 100644
> > --- a/tools/perf/builtin-trace.c
> > +++ b/tools/perf/builtin-trace.c
> > @@ -3664,6 +3664,14 @@ static int trace__config(const char *var, const char *value, void *arg)
> > "event selector. use 'perf list' to list available events",
> > parse_events_option);
> > err = parse_events_option(&o, value, 0);
> > +
> > + /*
> > + * When parse option successfully parse_events_option() will
> > + * return 0, otherwise means the paring failure. And it
> > + * returns 1 for eBPF program building failure; so adjust the
> > + * err value to -1 for the failure.
> > + */
> > + err = err ? -1 : 0;
>
> I'll rewrite the comment above to make it more succint and fix things
> like 'paring' (parsing):
>
> /*
> * parse_events_option() returns !0 to indicate failure
> * while the perf_config code that calls trace__config()
> * expects < 0 returns to indicate error, so:
> */
>
> if (err)
> err = -1;
This looks good to me. Thanks a lot for the reviewing.
Leo.
> > } else if (!strcmp(var, "trace.show_timestamp")) {
> > trace->show_tstamp = perf_config_bool(var, value);
> > } else if (!strcmp(var, "trace.show_duration")) {
> > --
> > 2.17.1
>
> --
>
> - Arnaldo