Re: [PATCH] perf vendor events arm64: Add hip08 implementation defined PMU core events

From: John Garry
Date: Thu Oct 19 2017 - 05:20:30 EST


I am looking at this topic now. But I am doubting the folder structure
again.

Firstly, we still have this comment in the README:
All the topic JSON files for a CPU model/family should be in a separate
sub directory.

Now, when thunderx3 or hip09 comes along, I assume that their jsons will
similarly go into cavium and hisilcon folders, respectively. So, for
example, we add thunderx3 json, like this:
arm64/cavium/thunderx3-imp-def.json
[
{
"PublicDescription": "foo",
"EventCode": "0x40",
"EventName": "bar",
"BriefDescription": "sieve",
},
]

Hang on, event 0x40 is one of the common ones, so that should be somewhere
else like arm64/armv8-common.json.

Hi Will,

The example I gave was poor, especially since 0x40 is one the events I am trying to commonise and it is also a recommended event.

It's the IMPLEMENTATION DEFINED events which I am worried about, which ThunderX2 does not seem to have.

Ok, as another example, consider this possibility of hip08 and hip09 not have the same event mappings for all IMPLEMENTATION DEFINED events, like this:

hip08 json:
[
{
"PublicDescription": "Attributable Level 1 data cache access, read",
"EventCode": "0x40",
"EventName": "L1D_CACHE_RD",
"BriefDescription": "L1D cache access, read",
},

[...]
{
"PublicDescription": "Cycles of that the number of issuing micro operations are less than 4",
"EventCode": "0x7001",
"EventName": "EXE_STALL_CYCLE",
"BriefDescription": "Cycles of that the number of issue ups are less than 4",
},

[...]
]


hip09 json is same as hip08, but IMPLEMENTATION DEFINED event 0x7001 has a different meaning:
{
"PublicDescription": "foo",
"EventCode": "0x7001",
"EventName": "bar",
"BriefDescription": "sieve",
},


So pmu_event.c looks like this:

#include "../../pmu-events/pmu-events.h"
struct pmu_event pme_cavium[] = {
{
.name = "l1d_cache_rd",
.event = "event=0x40",
.desc = "L1D cache read",
.topic = "thunderx2 imp def",
.long_desc = "Attributable Level 1 data cache access, read",
},

[ ... ]

{
.name = 0,
.event = 0,
.desc = 0,
},
};
struct pmu_event pme_hisilicon[] = {
{
.name = "l1d_cache_rd",
.event = "event=0x40",
.desc = "L1D cache access, read",
.topic = "hip08 imp def",
.long_desc = "Attributable Level 1 data cache access, read",
},
{
.name = "l1d_cache_wr",
.event = "event=0x41",
.desc = "L1D cache access, write",
.topic = "hip08 imp def",
.long_desc = "Attributable Level 1 data cache access, write",
},

[ ... ]

{
.name = "hit_on_prf",
.event = "event=0x6014",
.desc = "Hit on prefetched data",
.topic = "hip08 imp def",
.long_desc = "Hit on prefetched data",
},
{
.name = "exe_stall_cycle",
.event = "event=0x7001",
.desc = "Cycles of that the number of issue ups are less than 4",
.topic = "hip08 imp def",
.long_desc = "Cycles of that the number of issuing micro operations are less than 4",
},

[ ... ]
{
.name = "l1d_cache_rd",
.event = "event=0x40",
.desc = "L1D cache access, read",
.topic = "hip09 imp def",
.long_desc = "Attributable Level 1 data cache access, read",
},

[ ... ]

{
.name = "bar",
.event = "event=0x7001",
.desc = "sieve",
.topic = "hip09 imp def",
.long_desc = "foo",
},

[ ... ]

{
.name = 0,
.event = 0,
.desc = 0,
},
};
struct pmu_events_map pmu_events_map[] = {
{
.cpuid = "0x00000000420f5160",
.version = "v1",
.type = "core",
.table = pme_cavium
},
{
.cpuid = "0x00000000480fd010",
.version = "v1",
.type = "core",
.table = pme_hisilicon
},
{
.cpuid = "0x00000000480fd011",
.version = "v1",
.type = "core",
.table = pme_hisilicon
},
{
.cpuid = 0,
.version = 0,
.type = 0,
.table = 0,
},
};

So you can see conflicting entries for event 0x7001 in pme_hisilicon.

Note: cpuid is not valid, but just shown to be different to illustrate.


#Family-model,Version,Filename,EventType
0x00000000420f5160,v1,cavium,core
0x00000000420f5161,v1,cavium,core

Then we have generated pmu_events.c, like this:
#include "../../pmu-events/pmu-events.h"
struct pmu_event pme_cavium[] = {
{
.name = "bar",
.event = "event=0x40",
.desc = "sieve",
.topic = "thunderx3 imp def",
.long_desc = "foo",
},
{
.name = "l1d_cache_rd",
.event = "event=0x40",
.desc = "L1D cache read",
.topic = "thunderx2 imp def",
.long_desc = "Attributable Level 1 data cache access, read",
},
{
.name = "l1d_cache_wr",
.event = "event=0x41",
.desc = "L1D cache write",
.topic = "thunderx2 imp def",
.long_desc = "Attributable Level 1 data cache access, write ",
},

[ ... ]

{
.name = 0,
.event = 0,
.desc = 0,
},
};
struct pmu_events_map pmu_events_map[] = {
{
.cpuid = "0x00000000420f5160",
.version = "v1",
.type = "core",
.table = pme_cavium
},
{
.cpuid = "0x00000000420f5161",
.version = "v1",
.type = "core",
.table = pme_cavium
},
{
.cpuid = 0,
.version = 0,
.type = 0,
.table = 0,
},
};

It doesn't look right, espcially since we have conflicting definitions for
event 0x40. We really should have table per cpu.

There should be one definition of event 0x40 and it should use the wording
from the ARM ARM:

0x0040 , L1D_CACHE_RD, Attributable Level 1 data cache access, read
This event is similar to Level 1 data cache access, L1D_CACHE,
but the counter counts only memory-read operations that access
at least the Level 1 data or unified cache.

I think that the tricky bit is working out which subset of the
armv8-common.json file applies to a given CPU, but I was hoping you'd have
some ideas about that, even if it's an additional build step when building
perf to generate the final JSON files

I was looking at this, but I want to be happy with the current solution before I proceed.

Cheers,
John


Will

.