[PATCH v5 00/17] perf c2c: Support data source and display for Arm64

From: Leo Yan
Date: Sat Jun 04 2022 - 00:28:52 EST


Arm64 Neoverse CPUs supports data source in Arm SPE trace, this allows
us to detect cache line contention and transfers.

This patch set includes Ali's patch set v9 "perf: arm-spe: Decode SPE
source and use for perf c2c" [1] and rebased on the latest perf core
banch with latest commit 1bcca2b1bd67 ("perf vendor events intel:
Update metrics for Alderlake").

Patches 01-05 comes from Ali's patch set to support data source for Arm
SPE for neoverse cores.

Patches 06-17 are patches from patch set v4 for support perf c2c peer
display for Arm64 [2].

This patch set has been verified for both x86 perf memory events and Arm
SPE events.

[1] https://lore.kernel.org/lkml/20220517020326.18580-1-alisaidi@xxxxxxxxxx/
[2] https://lore.kernel.org/lkml/20220530114036.3225544-1-leo.yan@xxxxxxxxxx/

Changes from v4:
* Included Ali's patch set for adding data source in Arm SPE samples;
* Added Ian's ACK and Ali's review and test tags;
* Update document for the default peer dispaly for Arm64 (Ali).

Changes from v3:
* Changed to display remote and local peer accesses (Joe);
* Fixed the usage info for display types (Joe);
* Do not display HITM dimensions when use 'peer' display, and HITM
display doesn't show any 'peer' dimensions (James);
* Split to smaller patches for adding dimensions of peer operations;
* Updated documentation to reflect the latest GUI and stdio.

Changes from v2:
* Updated patch 04 to account metrics for both cache level and ld_peer
for PEER flag;
* Updated document for metric 'rmt_hit' which is accounted for all
remote accesses (include remote DRAM and any upward caches).

Changes from v1:
* Updated patches 01, 02 and 03 to support 'N/A' metrics for store
operations, so can align with the patch set [1] for store samples.


Ali Saidi (3):
perf: Add SNOOP_PEER flag to perf mem data struct
perf tools: sync addition of PERF_MEM_SNOOPX_PEER
perf arm-spe: Use SPE data source for neoverse cores

Leo Yan (14):
perf mem: Print snoop peer flag
perf arm-spe: Don't set data source if it's not a memory operation
perf mem: Add statistics for peer snooping
perf c2c: Output statistics for peer snooping
perf c2c: Add dimensions for peer load operations
perf c2c: Add dimensions of peer metrics for cache line view
perf c2c: Add mean dimensions for peer operations
perf c2c: Use explicit names for display macros
perf c2c: Rename dimension from 'percent_hitm' to
'percent_costly_snoop'
perf c2c: Refactor node header
perf c2c: Refactor display string
perf c2c: Sort on peer snooping for load operations
perf c2c: Use 'peer' as default display for Arm64
perf c2c: Update documentation for new display option 'peer'

include/uapi/linux/perf_event.h | 2 +-
tools/include/uapi/linux/perf_event.h | 2 +-
tools/perf/Documentation/perf-c2c.txt | 31 +-
tools/perf/builtin-c2c.c | 454 ++++++++++++++----
.../util/arm-spe-decoder/arm-spe-decoder.c | 1 +
.../util/arm-spe-decoder/arm-spe-decoder.h | 12 +
tools/perf/util/arm-spe.c | 140 +++++-
tools/perf/util/mem-events.c | 46 +-
tools/perf/util/mem-events.h | 3 +
9 files changed, 550 insertions(+), 141 deletions(-)

--
2.25.1