Re: perf 6.9-1 (archlinux) crashes during recording of cycles + raw_syscalls

From: Ian Rogers
Date: Tue Jun 04 2024 - 14:48:33 EST


On Tue, Jun 4, 2024 at 7:12 AM Arnaldo Carvalho de Melo <acme@xxxxxxxxxx> wrote:
>
> On Tue, Jun 04, 2024 at 10:50:00AM -0300, Arnaldo Carvalho de Melo wrote:
> > On Tue, Jun 04, 2024 at 01:44:18PM +0200, Milian Wolff wrote:
> > > the following command crashes perf for me, is this still an issue upstream and
> > > should I report it to arch for backporting? I cannot build the official perf/
> > > core branch, so I am afraid I cannot test it myself there:
>
> > > ```
> > > sudo /usr/bin/perf record -z --call-graph dwarf -e cycles -e
> > > raw_syscalls:sys_enter ls
> > > ...
> > > [ perf record: Woken up 3 times to write data ]
> > > malloc(): invalid next size (unsorted)
> > > Aborted
> > > ```
> > > Backtrace with GDB + debuginfod:
> > > ```
> > > malloc(): invalid next size (unsorted)
>
> > I reproduced this all the way back to 6.8, trying to bisect now, thanks
> > for the report,
>
> Can you please try with the attached and perhaps provide your Tested-by?
>
> Thanks,
>
> - Arnaldo
>
> From ab355e2c6b4cf641a9fff7af38059cf69ac712d5 Mon Sep 17 00:00:00 2001
> From: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
> Date: Tue, 4 Jun 2024 11:00:22 -0300
> Subject: [PATCH 1/1] Revert "perf record: Reduce memory for recording
> PERF_RECORD_LOST_SAMPLES event"
>
> This reverts commit 7d1405c71df21f6c394b8a885aa8a133f749fa22.

I think we should try to fight back reverts when possible. Reverts are
removing something somebody poured time and attention into. When a
regression has occurred then I think we should add the regression case
as a test.

> This causes segfaults in some cases, as reported by Milian:
>
> ```
> sudo /usr/bin/perf record -z --call-graph dwarf -e cycles -e
> raw_syscalls:sys_enter ls
> ...
> [ perf record: Woken up 3 times to write data ]
> malloc(): invalid next size (unsorted)
> Aborted
> ```
>
> Backtrace with GDB + debuginfod:
>
> ```
> malloc(): invalid next size (unsorted)
>
> Thread 1 "perf" received signal SIGABRT, Aborted.
> __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6,
> no_tid=no_tid@entry=0) at pthread_kill.c:44
> Downloading source file /usr/src/debug/glibc/glibc/nptl/pthread_kill.c
> 44 return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO
> (ret) : 0;
> (gdb) bt
> #0 __pthread_kill_implementation (threadid=<optimized out>,
> signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
> #1 0x00007ffff6ea8eb3 in __pthread_kill_internal (threadid=<optimized out>,
> signo=6) at pthread_kill.c:78
> #2 0x00007ffff6e50a30 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/
> raise.c:26
> #3 0x00007ffff6e384c3 in __GI_abort () at abort.c:79
> #4 0x00007ffff6e39354 in __libc_message_impl (fmt=fmt@entry=0x7ffff6fc22ea
> "%s\n") at ../sysdeps/posix/libc_fatal.c:132
> #5 0x00007ffff6eb3085 in malloc_printerr (str=str@entry=0x7ffff6fc5850
> "malloc(): invalid next size (unsorted)") at malloc.c:5772
> #6 0x00007ffff6eb657c in _int_malloc (av=av@entry=0x7ffff6ff6ac0
> <main_arena>, bytes=bytes@entry=368) at malloc.c:4081
> #7 0x00007ffff6eb877e in __libc_calloc (n=<optimized out>,
> elem_size=<optimized out>) at malloc.c:3754
> #8 0x000055555569bdb6 in perf_session.do_write_header ()
> #9 0x00005555555a373a in __cmd_record.constprop.0 ()
> #10 0x00005555555a6846 in cmd_record ()
> #11 0x000055555564db7f in run_builtin ()
> #12 0x000055555558ed77 in main ()
> ```
>
> Valgrind memcheck:
> ```
> ==45136== Invalid write of size 8
> ==45136== at 0x2B38A5: perf_event__synthesize_id_sample (in /usr/bin/perf)

The write that is happening here is because an empty sample except for
the id value is written out:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/synthetic-events.c?h=perf-tools-next#n1805
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/builtin-record.c?h=perf-tools-next#n1910
It'd been missed that the sample wasn't quite empty and needed space.

Rather than making the space 64kb again, can we not use
perf_event__sample_event_size or some max constant?

Thanks,
Ian

> ==45136== by 0x157069: __cmd_record.constprop.0 (in /usr/bin/perf)
> ==45136== by 0x15A845: cmd_record (in /usr/bin/perf)
> ==45136== by 0x201B7E: run_builtin (in /usr/bin/perf)
> ==45136== by 0x142D76: main (in /usr/bin/perf)
> ==45136== Address 0x6a866a8 is 0 bytes after a block of size 40 alloc'd
> ==45136== at 0x4849BF3: calloc (vg_replace_malloc.c:1675)
> ==45136== by 0x3574AB: zalloc (in /usr/bin/perf)
> ==45136== by 0x1570E0: __cmd_record.constprop.0 (in /usr/bin/perf)
> ==45136== by 0x15A845: cmd_record (in /usr/bin/perf)
> ==45136== by 0x201B7E: run_builtin (in /usr/bin/perf)
> ==45136== by 0x142D76: main (in /usr/bin/perf)
> ==45136==
> ==45136== Syscall param write(buf) points to unaddressable byte(s)
> ==45136== at 0x575953D: __libc_write (write.c:26)
> ==45136== by 0x575953D: write (write.c:24)
> ==45136== by 0x35761F: ion (in /usr/bin/perf)
> ==45136== by 0x357778: writen (in /usr/bin/perf)
> ==45136== by 0x1548F7: record__write (in /usr/bin/perf)
> ==45136== by 0x15708A: __cmd_record.constprop.0 (in /usr/bin/perf)
> ==45136== by 0x15A845: cmd_record (in /usr/bin/perf)
> ==45136== by 0x201B7E: run_builtin (in /usr/bin/perf)
> ==45136== by 0x142D76: main (in /usr/bin/perf)
> ==45136== Address 0x6a866a8 is 0 bytes after a block of size 40 alloc'd
> ==45136== at 0x4849BF3: calloc (vg_replace_malloc.c:1675)
> ==45136== by 0x3574AB: zalloc (in /usr/bin/perf)
> ==45136== by 0x1570E0: __cmd_record.constprop.0 (in /usr/bin/perf)
> ==45136== by 0x15A845: cmd_record (in /usr/bin/perf)
> ==45136== by 0x201B7E: run_builtin (in /usr/bin/perf)
> ==45136== by 0x142D76: main (in /usr/bin/perf)
> ==45136==
> -----
>
> Closes: https://lore.kernel.org/linux-perf-users/23879991.0LEYPuXRzz@milian-workstation/
> Reported-by: Milian Wolff <milian.wolff@xxxxxxxx>
> Cc: Adrian Hunter <adrian.hunter@xxxxxxxxx>
> Cc: Ian Rogers <irogers@xxxxxxxxxx>
> Cc: Jiri Olsa <jolsa@xxxxxxxxxx>
> Cc: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
> Cc: Namhyung Kim <namhyung@xxxxxxxxxx>
> Cc: stable@xxxxxxxxxx # 6.8+
> Signed-off-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
> ---
> tools/perf/builtin-record.c | 6 ++----
> 1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index 66a3de8ac66186b0..0a8ba1323d64be6b 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -1956,8 +1956,7 @@ static void record__read_lost_samples(struct record *rec)
>
> if (count.lost) {
> if (!lost) {
> - lost = zalloc(sizeof(*lost) +
> - session->machines.host.id_hdr_size);
> + lost = zalloc(PERF_SAMPLE_MAX_SIZE);
> if (!lost) {
> pr_debug("Memory allocation failed\n");
> return;
> @@ -1973,8 +1972,7 @@ static void record__read_lost_samples(struct record *rec)
> lost_count = perf_bpf_filter__lost_count(evsel);
> if (lost_count) {
> if (!lost) {
> - lost = zalloc(sizeof(*lost) +
> - session->machines.host.id_hdr_size);
> + lost = zalloc(PERF_SAMPLE_MAX_SIZE);
> if (!lost) {
> pr_debug("Memory allocation failed\n");
> return;
> --
> 2.45.1
>