On Tue, Jul 23, 2024 at 7:41 AM James Clark <james.clark@xxxxxxxxxx> wrote:
On 20/07/2024 8:45 am, Ian Rogers wrote:
Andi Kleen reported a regression where `perf script +F metric` would
crash. With this change the output is:
```
$ perf record -a -e '{cycles,instructions}:S' perf bench mem memcpy
21.229620 GB/sec
15.751008 GB/sec
16.009221 GB/sec
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.945 MB perf.data (294 samples) ]
$ perf --no-pager script -F +metric
perf 1912464 [000] 814503.473101: 6325 cycles: ffffffff8548d64a native_write_msr+0xa ([kernel.kallsyms])
perf 1912464 [000] 814503.473101: metric: 0.06 insn per cycle
perf 1912464 [000] 814503.473101: 351 instructions: ffffffff8548d64a native_write_msr+0xa ([kernel.kallsyms])
perf 1912464 [000] 814503.473101: metric: 0.03 insn per cycle
...
```
For some reason I only get the metric: lines when I record with -a. I
noticed this because Andi's test doesn't use -a so it fails.
I'm not sure if that's expected or it's related to your disclaimer below?
It is. When you don't do -a the cpu map just contains -1 and for some
reason it is busted. The whole indirections to arrays of arrays,
counts, stats, aggregations, with indices into various other arrays
and a lack of helpers. The code works for perf stat, but there is a
lot of complexity that I don't fully grok in that. Here I've tried to
kind of break down what the code is trying to do in the comments, but
the old code never did sample_read_group__for_each so was is broken
with leader sampling? Is the leader sampling pretending the read
counts are periods and calling process sample multiple times. Andi
likely knows this code better than me so I was hoping he could fix it
up. We may want to take the patches anyway in order to not have a
segv.
Thanks,
Ian
The change fixes perf script to update counts and thereby aggregate
values which then get consumed by unchanged metric logic in the shadow
stat output. Note, it would be preferential to switch to json metrics.
Reported-by: Andi Kleen <ak@xxxxxxxxxxxxxxx>
Closes: https://lore.kernel.org/linux-perf-users/20240713155443.1665378-1-ak@xxxxxxxxxxxxxxx/
Fixes: 37cc8ad77cf8 ("perf metric: Directly use counts rather than saved_value")'
Signed-off-by: Ian Rogers <irogers@xxxxxxxxxx>
---
The code isn't well tested nor does it support non-leader sampling
reading of counts based on periods that seemed to present in the
previous code. Sending out for the sake of discussion. Andi's changes
added a test and that should certainly be added.
---
tools/perf/builtin-script.c | 114 +++++++++++++++++++++++++++++-------
1 file changed, 93 insertions(+), 21 deletions(-)