Re: [PATCHES 0/8] perf tools: Diagnostic offsets in skip messages + two hardening fixes

From: Arnaldo Carvalho de Melo

Date: Wed Jun 03 2026 - 15:31:58 EST


On Wed, Jun 03, 2026 at 08:06:48AM -0700, Ian Rogers wrote:
> On Tue, Jun 2, 2026 at 4:57 PM Arnaldo Carvalho de Melo <acme@xxxxxxxxxx> wrote:
> >
> > When perf report, perf sched, or perf timechart skip a malformed or
> > unprocessable event, the warning message doesn't say where in the
> > perf.data file the problem occurred. This makes it hard to
> > cross-reference with 'perf report -D' output or to locate the
> > corrupted region with a hex editor.
> >
> > This series adds a file_offset field to struct perf_sample, set in the
> > event delivery path (including the deferred callchain path), and
> > retrofits all skip/stop/error messages to include:
> >
> > - The file offset where the event was found
> > - The event type name via perf_event__name() with the numeric
> >   type value in parentheses
> >
> > For example, instead of:
> >
> >   problem processing 10 event, skipping it.
> >
> > a user now sees:
> >
> >   WARNING: at offset 0x1a3f0: MMAP2 (10) event size 24 too small (min 64), skipping
> >
> > The peek_event() path, which validates events during initial file
> > scanning, also gains file offsets in its three warning messages
> > (misaligned size, unsupported type, undersized event).
> >
> > Two pre-existing bugs found by sashiko-bot are fixed:
> >
> > - builtin-timechart.c cat_backtrace(): use-after-free and
> >   double-free when an invalid callchain context triggers zfree()
> >   before fclose() on an open_memstream buffer. The open_memstream
> >   contract requires fclose() before the buffer can be freed; see
> >   open_memstream(3).
>
> Fwiw, I've also been around the timechart code prompted by AI review
> and also trying to clean up tests with address sanitizer:
> https://lore.kernel.org/linux-perf-users/agzWqrn6XPEwTAsb@xxxxxxxxxx/

Thanks for all the reviews. I'll merge this series, since sashiko found
just one endianness issue with the new 'perf test' entry and the other
comments are about pre-existing problems that we've added to TODO lists.
Then you can rebase your timechart leak fixes on top of it, ok?

- Arnaldo

> Thanks,
> Ian
>
> > - builtin-sched.c: three BUG_ON(cpu >= MAX_CPUS || cpu < 0)
> >   that abort perf sched when PERF_SAMPLE_CPU is absent from the
> >   sample type and the CPU sentinel (u32)-1 is cast to signed -1.
> >   perf.data is untrusted input: a corrupted or truncated file
> >   should produce a warning, not an abort.
> >
> > Arnaldo Carvalho de Melo (8):
> >   perf sample: Add file_offset field to struct perf_sample
> >   perf session: Include file offset in event skip/stop messages
> >   perf sched: Include file offset in event skip messages
> >   perf timechart: Include file offset in CPU bounds check messages
> >   perf tools: Include file offset and event type name in skip messages
> >   perf timechart: Fix cat_backtrace() use-after-free on corrupted callchain
> >   perf sched: Replace BUG_ON on invalid CPU with graceful skip
> >   perf test: Add file offset diagnostic test for corrupted perf.data
> >
> > 15 files changed, 261 insertions(+), 101 deletions(-)
> >
> > Developed with AI assistance (Claude/sashiko), tagged in commits.
> >
> > Best regards,
> >
> > - Arnaldo
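
For readers following along, the BUG_ON replacement described in the
quoted cover letter can be sketched roughly as below. This is only an
illustration, not the code from the series: struct sample_sketch,
cpu_is_valid(), the MAX_CPUS value and the "invalid CPU" message text
are all hypothetical stand-ins. It assumes the usual two's-complement
targets, where converting (u32)-1 to int yields -1.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_CPUS 4096	/* assumed bound, chosen only for this sketch */

/* Hypothetical, trimmed-down stand-in for struct perf_sample. */
struct sample_sketch {
	uint64_t file_offset;	/* where the event sits in perf.data */
	uint32_t cpu;		/* (u32)-1 when PERF_SAMPLE_CPU is absent */
};

/*
 * Return true when the CPU number can be used as an array index.
 * Otherwise format a warning that carries the file offset into buf
 * and return false, so the caller can skip the event instead of
 * aborting.
 */
static bool cpu_is_valid(const struct sample_sketch *s, char *buf, size_t len)
{
	int cpu = (int)s->cpu;	/* the (u32)-1 sentinel shows up as -1 */

	if (cpu < 0 || cpu >= MAX_CPUS) {
		snprintf(buf, len,
			 "WARNING: at offset %#llx: invalid CPU %d, skipping",
			 (unsigned long long)s->file_offset, cpu);
		return false;
	}

	buf[0] = '\0';
	return true;
}
```

A caller would check the return value at the top of the event handler
and return early on false, printing the formatted warning, instead of
hitting a BUG_ON-style abort on a corrupted or truncated file.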