Re: [PATCH v2 00/11] perf c2c: Sort cacheline with all loads
From: Jiri Olsa
Date: Sun Jan 03 2021 - 17:54:12 EST
On Sun, Dec 13, 2020 at 01:38:39PM +0000, Leo Yan wrote:
> This patch set is to sort cache line for all load operations which hit
> any cache levels. For single cache line view, it shows the load
> references for loads with cache hits and with cache misses respectively.
>
> This series is a following for the old patch set "perf c2c: Sort
> cacheline with LLC load" [1], in the old patch set it tries to sort
> cache line with the load operations in last level cache (LLC), after
> testing we found the trace data doesn't contain LLC events if the
> platform isn't a NUMA system. For this reason, this series refines the
> implementation to sort on all cache levels hits of load operations; it's
> reasonable for us to review the load and store opreations, if detects
> any cache line is accessed by multi-threads, this hints that the cache
> line is possible for false sharing.
>
> This patch set is clearly applied on perf/core branch with the latest
> commit db0ea13cc741 ("perf evlist: Use the right prefix for 'struct
> evlist' record methods"). And the changes has been tested on x86 and
> Arm64, the testing result is shown as below.
SNIP
>
>
> [...]
>
> Changes from v1:
> * Changed from sorting on LLC to sorting on all loads with cache hits;
> * Added patches 06/11, 07/11 for refactoring macros;
> * Added patch 08/11 for refactoring node header, so can display "%loads"
> rather than "%hitms" in the header;
> * Added patch 09/11 to add local pointers for pointing to output metrics
> string and sort string (Juri);
> * Added warning in percent_hitm() for the display "all", which should
> never happen (Juri).
>
> [1] https://lore.kernel.org/patchwork/cover/1321514/
>
>
> Leo Yan (11):
> perf c2c: Add dimensions for total load hit
> perf c2c: Add dimensions for load hit
> perf c2c: Add dimensions for load miss
> perf c2c: Rename for shared cache line stats
> perf c2c: Refactor hist entry validation
> perf c2c: Refactor display filter macro
> perf c2c: Refactor node display macro
> perf c2c: Refactor node header
> perf c2c: Add local variables for output metrics
> perf c2c: Sort on all cache hit for load operations
> perf c2c: Update documentation for display option 'all'
>
> tools/perf/Documentation/perf-c2c.txt | 21 +-
> tools/perf/builtin-c2c.c | 548 ++++++++++++++++++++++----
> 2 files changed, 487 insertions(+), 82 deletions(-)
Joe might want to test it first, but it looks all good to me:
Acked-by: Jiri Olsa <jolsa@xxxxxxxxxx>
thanks,
jirka