Re: [PATCH v1] perf trace: Fix BPF loading failure (-E2BIG)

From: Namhyung Kim
Date: Wed Dec 18 2024 - 19:45:48 EST


Hello,

On Thu, Dec 12, 2024 at 06:30:47PM -0800, Howard Chu wrote:
> As reported by Namhyung Kim and acknowledged by Qiao Zhao (link:
> https://lore.kernel.org/linux-perf-users/20241206001436.1947528-1-namhyung@xxxxxxxxxx/),
> on certain machines, perf trace failed to load the BPF program into the
> kernel. The verifier runs perf trace's BPF program for up to 1 million
> instructions, returning an E2BIG error, whereas the perf trace BPF
> program should be much less complex than that. This patch aims to fix
> the issue described above.
>
> The E2BIG problem from clang-15 to clang-16 is cause by this line:
> } else if (size < 0 && size >= -6) { /* buffer */
>
> Specifically this check: size < 0. seems like clang generates a cool
> optimization to this sign check that breaks things.
>
> Making 'size' s64, and use
> } else if ((int)size < 0 && size >= -6) { /* buffer */
>
> Solves the problem. This is some Hogwarts magic.
>
> And the unbounded access of clang-12 and clang-14 (clang-13 works this
> time) is fixed by making variable 'aug_size' s64.
>
> As for this:
> -if (aug_size > TRACE_AUG_MAX_BUF)
> - aug_size = TRACE_AUG_MAX_BUF;
> +aug_size = args->args[index] > TRACE_AUG_MAX_BUF ? TRACE_AUG_MAX_BUF : args->args[index];
>
> This makes the BPF skel generated by clang-18 work. Yes, new clangs
> introduce problems too.
>
> Sorry, I only know that it works, but I don't know how it works. I'm not
> an expert in the BPF verifier. I really hope this is not a kernel
> version issue, as that would make the test case (kernel_nr) *
> (clang_nr), a true horror story. I will test it on more kernel versions
> in the future.
>
> Fixes: 395d38419f18: ("perf trace augmented_raw_syscalls: Add more check s to pass the verifier")
> Reported-by: Namhyung Kim <namhyung@xxxxxxxxxx>
> Signed-off-by: Howard Chu <howardchu95@xxxxxxxxx>

Thanks for the fix!

Tested-by: Namhyung Kim <namhyung@xxxxxxxxxx>

> ---
> tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c | 11 ++++-------
> 1 file changed, 4 insertions(+), 7 deletions(-)
>
> diff --git a/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c b/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
> index 4a62ed593e84..e4352881e3fa 100644
> --- a/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
> +++ b/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
> @@ -431,9 +431,9 @@ static bool pid_filter__has(struct pids_filtered *pids, pid_t pid)
> static int augment_sys_enter(void *ctx, struct syscall_enter_args *args)
> {
> bool augmented, do_output = false;
> - int zero = 0, size, aug_size, index,
> - value_size = sizeof(struct augmented_arg) - offsetof(struct augmented_arg, value);
> + int zero = 0, index, value_size = sizeof(struct augmented_arg) - offsetof(struct augmented_arg, value);
> u64 output = 0; /* has to be u64, otherwise it won't pass the verifier */
> + s64 aug_size, size;
> unsigned int nr, *beauty_map;
> struct beauty_payload_enter *payload;
> void *arg, *payload_offset;
> @@ -484,14 +484,11 @@ static int augment_sys_enter(void *ctx, struct syscall_enter_args *args)
> } else if (size > 0 && size <= value_size) { /* struct */
> if (!bpf_probe_read_user(((struct augmented_arg *)payload_offset)->value, size, arg))
> augmented = true;
> - } else if (size < 0 && size >= -6) { /* buffer */
> + } else if ((int)size < 0 && size >= -6) { /* buffer */
> index = -(size + 1);
> barrier_var(index); // Prevent clang (noticed with v18) from removing the &= 7 trick.
> index &= 7; // Satisfy the bounds checking with the verifier in some kernels.
> - aug_size = args->args[index];
> -
> - if (aug_size > TRACE_AUG_MAX_BUF)
> - aug_size = TRACE_AUG_MAX_BUF;
> + aug_size = args->args[index] > TRACE_AUG_MAX_BUF ? TRACE_AUG_MAX_BUF : args->args[index];
>
> if (aug_size > 0) {
> if (!bpf_probe_read_user(((struct augmented_arg *)payload_offset)->value, aug_size, arg))
> --
> 2.43.0
>