Re: [PATCH v1 2/2] perf script: Fix for `perf script +F metric` with leader sampling

From: Ian Rogers
Date: Tue Jul 23 2024 - 10:57:52 EST


On Tue, Jul 23, 2024 at 7:41 AM James Clark <james.clark@xxxxxxxxxx> wrote:
>
>
>
> On 20/07/2024 8:45 am, Ian Rogers wrote:
> > Andi Kleen reported a regression where `perf script +F metric` would
> > crash. With this change the output is:
> >
> > ```
> > $ perf record -a -e '{cycles,instructions}:S' perf bench mem memcpy
> >
> > 21.229620 GB/sec
> >
> > 15.751008 GB/sec
> >
> > 16.009221 GB/sec
> > [ perf record: Woken up 1 times to write data ]
> > [ perf record: Captured and wrote 1.945 MB perf.data (294 samples) ]
> > $ perf --no-pager script -F +metric
> > perf 1912464 [000] 814503.473101: 6325 cycles: ffffffff8548d64a native_write_msr+0xa ([kernel.kallsyms])
> > perf 1912464 [000] 814503.473101: metric: 0.06 insn per cycle
> > perf 1912464 [000] 814503.473101: 351 instructions: ffffffff8548d64a native_write_msr+0xa ([kernel.kallsyms])
> > perf 1912464 [000] 814503.473101: metric: 0.03 insn per cycle
> > ...
> > ```
>
> For some reason I only get the metric: lines when I record with -a. I
> noticed this because Andi's test doesn't use -a so it fails.
>
> I'm not sure if that's expected or it's related to your disclaimer below?

It is. When you don't do -a the cpu map just contains -1 and for some
reason it is busted. The whole indirections to arrays of arrays,
counts, stats, aggregations, with indices into various other arrays
and a lack of helpers. The code works for perf stat, but there is a
lot of complexity that I don't fully grok in that. Here I've tried to
kind of break down what the code is trying to do in the comments, but
the old code never did sample_read_group__for_each so was is broken
with leader sampling? Is the leader sampling pretending the read
counts are periods and calling process sample multiple times. Andi
likely knows this code better than me so I was hoping he could fix it
up. We may want to take the patches anyway in order to not have a
segv.

Thanks,
Ian

> >
> > The change fixes perf script to update counts and thereby aggregate
> > values which then get consumed by unchanged metric logic in the shadow
> > stat output. Note, it would be preferential to switch to json metrics.
> >
> > Reported-by: Andi Kleen <ak@xxxxxxxxxxxxxxx>
> > Closes: https://lore.kernel.org/linux-perf-users/20240713155443.1665378-1-ak@xxxxxxxxxxxxxxx/
> > Fixes: 37cc8ad77cf8 ("perf metric: Directly use counts rather than saved_value")'
> > Signed-off-by: Ian Rogers <irogers@xxxxxxxxxx>
> > ---
> > The code isn't well tested nor does it support non-leader sampling
> > reading of counts based on periods that seemed to present in the
> > previous code. Sending out for the sake of discussion. Andi's changes
> > added a test and that should certainly be added.
> > ---
> > tools/perf/builtin-script.c | 114 +++++++++++++++++++++++++++++-------
> > 1 file changed, 93 insertions(+), 21 deletions(-)
> >