Re: [PATCH v5 2/5] perf: Add SNOOP_PEER flag to perf mem data struct

From: Leo Yan
Date: Fri Apr 29 2022 - 05:29:16 EST


On Wed, Apr 27, 2022 at 03:29:31PM -0400, Liang, Kan wrote:

[...]

> > It's a bit difficult for me to make decision is because now SNOOPX_FWD
> > is not used in the file util/mem-events.c, so I am not very sure if
> > SNOOPX_FWD has the consistent usage across different arches.
>
> No, it's used in the file util/mem-events.c
> See perf_mem__snp_scnprintf().

Right. Actually I mean FWD flag is not for statistics.

> > On the other hand, I sent a patch for 'peer' flag statistics [1], you
> > could review it and it only stats for L2 and L3 cache level for local
> > node.
>
> If it's for the local node, why don't you use the hop level which is
> introduced recently by Power? The below seems a good fit.
>
> PERF_MEM_LVLNUM_ANY_CACHE | PERF_MEM_HOPS_0?
>
> /* hop level */
> #define PERF_MEM_HOPS_0 0x01 /* remote core, same node */
> #define PERF_MEM_HOPS_1 0x02 /* remote node, same socket */
> #define PERF_MEM_HOPS_2 0x03 /* remote socket, same board */
> #define PERF_MEM_HOPS_3 0x04 /* remote board */
> /* 5-7 available */
> #define PERF_MEM_HOPS_SHIFT 43

Thanks for reminding. I have considered HOPS flags during the
discussion with Ali for the flags, you could see PERF_MEM_HOPS_0 is for
"remote core, same node", this could introduce confusion for Arm
Neoverse CPUs. Another thinking is how we consume the flags in perf c2c
tool, perf c2c tool uses snoop flag to find out the costly cache
conherency operations, if we use PERF_MEM_HOPS_0 that means we need to
extend perf c2c tool to support two kinds of flags: snoop flag and hop
flag, so it would introduce complexity for perf c2c code.

If we step back to review current flags, we can see different arches
have different memory model (and implementations), it is a bit painful
when we try to unify the flags. So at current stage, I prefer to use
PEER flag for Arm arches, essentially it's not too bad that now we
introduce one bit, and later we may consider to consolidate memory
flags in more general way.

Thanks,
Leo