Re: [PATCH 7/8] perf vendor events: Update events for Tigerlake

From: Xing Zhengjun
Date: Fri Mar 18 2022 - 05:22:33 EST




On 3/18/2022 2:28 AM, Ian Rogers wrote:
The change:
https://github.com/intel/event-converter-for-linux-perf/commit/fc680410402e394eed6a1ebd909c9f649d3ed3ef
moved certain "other" type of events in to the cache and pipeline topics.
Update the perf json files for this change.

Signed-off-by: Ian Rogers <irogers@xxxxxxxxxx>

Reviewed-by: Zhengjun Xing <zhengjun.xing@xxxxxxxxxxxxxxx>

---
.../pmu-events/arch/x86/tigerlake/cache.json | 86 ++++++++++++
.../pmu-events/arch/x86/tigerlake/other.json | 129 ------------------
.../arch/x86/tigerlake/pipeline.json | 43 ++++++
3 files changed, 129 insertions(+), 129 deletions(-)

diff --git a/tools/perf/pmu-events/arch/x86/tigerlake/cache.json b/tools/perf/pmu-events/arch/x86/tigerlake/cache.json
index 543a3298f86f..0569b2c704ca 100644
--- a/tools/perf/pmu-events/arch/x86/tigerlake/cache.json
+++ b/tools/perf/pmu-events/arch/x86/tigerlake/cache.json
@@ -493,6 +493,48 @@
"SampleAfterValue": "50021",
"UMask": "0x20"
},
+ {
+ "BriefDescription": "Counts demand data reads that hit a cacheline in the L3 where a snoop hit in another cores caches, data forwarding is required as the data is modified.",
+ "CollectPEBSRecord": "2",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xB7, 0xBB",
+ "EventName": "OCR.DEMAND_DATA_RD.L3_HIT.SNOOP_HITM",
+ "MSRIndex": "0x1a6,0x1a7",
+ "MSRValue": "0x10003C0001",
+ "Offcore": "1",
+ "PEBScounters": "0,1,2,3",
+ "PublicDescription": "Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+ "SampleAfterValue": "100003",
+ "UMask": "0x1"
+ },
+ {
+ "BriefDescription": "OCR.DEMAND_DATA_RD.L3_HIT.SNOOP_HIT_WITH_FWD",
+ "CollectPEBSRecord": "2",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xB7, 0xBB",
+ "EventName": "OCR.DEMAND_DATA_RD.L3_HIT.SNOOP_HIT_WITH_FWD",
+ "MSRIndex": "0x1a6,0x1a7",
+ "MSRValue": "0x8003C0001",
+ "Offcore": "1",
+ "PEBScounters": "0,1,2,3",
+ "PublicDescription": "Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+ "SampleAfterValue": "100003",
+ "UMask": "0x1"
+ },
+ {
+ "BriefDescription": "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that hit a cacheline in the L3 where a snoop hit in another cores caches, data forwarding is required as the data is modified.",
+ "CollectPEBSRecord": "2",
+ "Counter": "0,1,2,3",
+ "EventCode": "0xB7, 0xBB",
+ "EventName": "OCR.DEMAND_RFO.L3_HIT.SNOOP_HITM",
+ "MSRIndex": "0x1a6,0x1a7",
+ "MSRValue": "0x10003C0002",
+ "Offcore": "1",
+ "PEBScounters": "0,1,2,3",
+ "PublicDescription": "Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+ "SampleAfterValue": "100003",
+ "UMask": "0x1"
+ },
{
"BriefDescription": "Demand and prefetch data reads",
"CollectPEBSRecord": "2",
@@ -627,5 +669,49 @@
"PublicDescription": "Counts the cycles for which the thread is active and the superQ cannot take any more entries.",
"SampleAfterValue": "100003",
"UMask": "0x4"
+ },
+ {
+ "BriefDescription": "Number of PREFETCHNTA instructions executed.",
+ "CollectPEBSRecord": "2",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x32",
+ "EventName": "SW_PREFETCH_ACCESS.NTA",
+ "PEBScounters": "0,1,2,3",
+ "PublicDescription": "Counts the number of PREFETCHNTA instructions executed.",
+ "SampleAfterValue": "100003",
+ "UMask": "0x1"
+ },
+ {
+ "BriefDescription": "Number of PREFETCHW instructions executed.",
+ "CollectPEBSRecord": "2",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x32",
+ "EventName": "SW_PREFETCH_ACCESS.PREFETCHW",
+ "PEBScounters": "0,1,2,3",
+ "PublicDescription": "Counts the number of PREFETCHW instructions executed.",
+ "SampleAfterValue": "100003",
+ "UMask": "0x8"
+ },
+ {
+ "BriefDescription": "Number of PREFETCHT0 instructions executed.",
+ "CollectPEBSRecord": "2",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x32",
+ "EventName": "SW_PREFETCH_ACCESS.T0",
+ "PEBScounters": "0,1,2,3",
+ "PublicDescription": "Counts the number of PREFETCHT0 instructions executed.",
+ "SampleAfterValue": "100003",
+ "UMask": "0x2"
+ },
+ {
+ "BriefDescription": "Number of PREFETCHT1 or PREFETCHT2 instructions executed.",
+ "CollectPEBSRecord": "2",
+ "Counter": "0,1,2,3",
+ "EventCode": "0x32",
+ "EventName": "SW_PREFETCH_ACCESS.T1_T2",
+ "PEBScounters": "0,1,2,3",
+ "PublicDescription": "Counts the number of PREFETCHT1 or PREFETCHT2 instructions executed.",
+ "SampleAfterValue": "100003",
+ "UMask": "0x4"
}
]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/tigerlake/other.json b/tools/perf/pmu-events/arch/x86/tigerlake/other.json
index b1143fe74246..304cd09fe159 100644
--- a/tools/perf/pmu-events/arch/x86/tigerlake/other.json
+++ b/tools/perf/pmu-events/arch/x86/tigerlake/other.json
@@ -43,48 +43,6 @@
"SampleAfterValue": "200003",
"UMask": "0x20"
},
- {
- "BriefDescription": "Counts demand data reads that hit a cacheline in the L3 where a snoop hit in another cores caches, data forwarding is required as the data is modified.",
- "CollectPEBSRecord": "2",
- "Counter": "0,1,2,3",
- "EventCode": "0xB7, 0xBB",
- "EventName": "OCR.DEMAND_DATA_RD.L3_HIT.SNOOP_HITM",
- "MSRIndex": "0x1a6,0x1a7",
- "MSRValue": "0x10003C0001",
- "Offcore": "1",
- "PEBScounters": "0,1,2,3",
- "PublicDescription": "Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
- "SampleAfterValue": "100003",
- "UMask": "0x1"
- },
- {
- "BriefDescription": "OCR.DEMAND_DATA_RD.L3_HIT.SNOOP_HIT_WITH_FWD",
- "CollectPEBSRecord": "2",
- "Counter": "0,1,2,3",
- "EventCode": "0xB7, 0xBB",
- "EventName": "OCR.DEMAND_DATA_RD.L3_HIT.SNOOP_HIT_WITH_FWD",
- "MSRIndex": "0x1a6,0x1a7",
- "MSRValue": "0x8003C0001",
- "Offcore": "1",
- "PEBScounters": "0,1,2,3",
- "PublicDescription": "Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
- "SampleAfterValue": "100003",
- "UMask": "0x1"
- },
- {
- "BriefDescription": "Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that hit a cacheline in the L3 where a snoop hit in another cores caches, data forwarding is required as the data is modified.",
- "CollectPEBSRecord": "2",
- "Counter": "0,1,2,3",
- "EventCode": "0xB7, 0xBB",
- "EventName": "OCR.DEMAND_RFO.L3_HIT.SNOOP_HITM",
- "MSRIndex": "0x1a6,0x1a7",
- "MSRValue": "0x10003C0002",
- "Offcore": "1",
- "PEBScounters": "0,1,2,3",
- "PublicDescription": "Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
- "SampleAfterValue": "100003",
- "UMask": "0x1"
- },
{
"BriefDescription": "Counts streaming stores that have any type of response.",
"CollectPEBSRecord": "2",
@@ -98,92 +56,5 @@
"PublicDescription": "Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
"SampleAfterValue": "100003",
"UMask": "0x1"
- },
- {
- "BriefDescription": "Number of PREFETCHNTA instructions executed.",
- "CollectPEBSRecord": "2",
- "Counter": "0,1,2,3",
- "EventCode": "0x32",
- "EventName": "SW_PREFETCH_ACCESS.NTA",
- "PEBScounters": "0,1,2,3",
- "PublicDescription": "Counts the number of PREFETCHNTA instructions executed.",
- "SampleAfterValue": "100003",
- "UMask": "0x1"
- },
- {
- "BriefDescription": "Number of PREFETCHW instructions executed.",
- "CollectPEBSRecord": "2",
- "Counter": "0,1,2,3",
- "EventCode": "0x32",
- "EventName": "SW_PREFETCH_ACCESS.PREFETCHW",
- "PEBScounters": "0,1,2,3",
- "PublicDescription": "Counts the number of PREFETCHW instructions executed.",
- "SampleAfterValue": "100003",
- "UMask": "0x8"
- },
- {
- "BriefDescription": "Number of PREFETCHT0 instructions executed.",
- "CollectPEBSRecord": "2",
- "Counter": "0,1,2,3",
- "EventCode": "0x32",
- "EventName": "SW_PREFETCH_ACCESS.T0",
- "PEBScounters": "0,1,2,3",
- "PublicDescription": "Counts the number of PREFETCHT0 instructions executed.",
- "SampleAfterValue": "100003",
- "UMask": "0x2"
- },
- {
- "BriefDescription": "Number of PREFETCHT1 or PREFETCHT2 instructions executed.",
- "CollectPEBSRecord": "2",
- "Counter": "0,1,2,3",
- "EventCode": "0x32",
- "EventName": "SW_PREFETCH_ACCESS.T1_T2",
- "PEBScounters": "0,1,2,3",
- "PublicDescription": "Counts the number of PREFETCHT1 or PREFETCHT2 instructions executed.",
- "SampleAfterValue": "100003",
- "UMask": "0x4"
- },
- {
- "BriefDescription": "TMA slots where no uops were being issued due to lack of back-end resources.",
- "CollectPEBSRecord": "2",
- "Counter": "0,1,2,3,4,5,6,7",
- "EventCode": "0xa4",
- "EventName": "TOPDOWN.BACKEND_BOUND_SLOTS",
- "PEBScounters": "0,1,2,3,4,5,6,7",
- "PublicDescription": "Counts the number of Top-down Microarchitecture Analysis (TMA) method's slots where no micro-operations were being issued from front-end to back-end of the machine due to lack of back-end resources.",
- "SampleAfterValue": "10000003",
- "UMask": "0x2"
- },
- {
- "BriefDescription": "TMA slots wasted due to incorrect speculation by branch mispredictions",
- "CollectPEBSRecord": "2",
- "Counter": "0,1,2,3,4,5,6,7",
- "EventCode": "0xa4",
- "EventName": "TOPDOWN.BR_MISPREDICT_SLOTS",
- "PEBScounters": "0,1,2,3,4,5,6,7",
- "PublicDescription": "Number of TMA slots that were wasted due to incorrect speculation by branch mispredictions. This event estimates number of operations that were issued but not retired from the specualtive path as well as the out-of-order engine recovery past a branch misprediction.",
- "SampleAfterValue": "10000003",
- "UMask": "0x8"
- },
- {
- "BriefDescription": "TMA slots available for an unhalted logical processor. Fixed counter - architectural event",
- "CollectPEBSRecord": "2",
- "Counter": "Fixed counter 3",
- "EventName": "TOPDOWN.SLOTS",
- "PEBScounters": "35",
- "PublicDescription": "Number of available slots for an unhalted logical processor. The event increments by machine-width of the narrowest pipeline as employed by the Top-down Microarchitecture Analysis method (TMA). The count is distributed among unhalted logical processors (hyper-threads) who share the same physical core. Software can use this event as the denominator for the top-level metrics of the TMA method. This architectural event is counted on a designated fixed counter (Fixed Counter 3).",
- "SampleAfterValue": "10000003",
- "UMask": "0x4"
- },
- {
- "BriefDescription": "TMA slots available for an unhalted logical processor. General counter - architectural event",
- "CollectPEBSRecord": "2",
- "Counter": "0,1,2,3,4,5,6,7",
- "EventCode": "0xa4",
- "EventName": "TOPDOWN.SLOTS_P",
- "PEBScounters": "0,1,2,3,4,5,6,7",
- "PublicDescription": "Counts the number of available slots for an unhalted logical processor. The event increments by machine-width of the narrowest pipeline as employed by the Top-down Microarchitecture Analysis method. The count is distributed among unhalted logical processors (hyper-threads) who share the same physical core.",
- "SampleAfterValue": "10000003",
- "UMask": "0x1"
}
]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/tigerlake/pipeline.json b/tools/perf/pmu-events/arch/x86/tigerlake/pipeline.json
index 4dc3a16e3da4..d436775c80db 100644
--- a/tools/perf/pmu-events/arch/x86/tigerlake/pipeline.json
+++ b/tools/perf/pmu-events/arch/x86/tigerlake/pipeline.json
@@ -711,6 +711,49 @@
"SampleAfterValue": "100003",
"UMask": "0x1"
},
+ {
+ "BriefDescription": "TMA slots where no uops were being issued due to lack of back-end resources.",
+ "CollectPEBSRecord": "2",
+ "Counter": "0,1,2,3,4,5,6,7",
+ "EventCode": "0xa4",
+ "EventName": "TOPDOWN.BACKEND_BOUND_SLOTS",
+ "PEBScounters": "0,1,2,3,4,5,6,7",
+ "PublicDescription": "Counts the number of Top-down Microarchitecture Analysis (TMA) method's slots where no micro-operations were being issued from front-end to back-end of the machine due to lack of back-end resources.",
+ "SampleAfterValue": "10000003",
+ "UMask": "0x2"
+ },
+ {
+ "BriefDescription": "TMA slots wasted due to incorrect speculation by branch mispredictions",
+ "CollectPEBSRecord": "2",
+ "Counter": "0,1,2,3,4,5,6,7",
+ "EventCode": "0xa4",
+ "EventName": "TOPDOWN.BR_MISPREDICT_SLOTS",
+ "PEBScounters": "0,1,2,3,4,5,6,7",
+ "PublicDescription": "Number of TMA slots that were wasted due to incorrect speculation by branch mispredictions. This event estimates number of operations that were issued but not retired from the specualtive path as well as the out-of-order engine recovery past a branch misprediction.",
+ "SampleAfterValue": "10000003",
+ "UMask": "0x8"
+ },
+ {
+ "BriefDescription": "TMA slots available for an unhalted logical processor. Fixed counter - architectural event",
+ "CollectPEBSRecord": "2",
+ "Counter": "Fixed counter 3",
+ "EventName": "TOPDOWN.SLOTS",
+ "PEBScounters": "35",
+ "PublicDescription": "Number of available slots for an unhalted logical processor. The event increments by machine-width of the narrowest pipeline as employed by the Top-down Microarchitecture Analysis method (TMA). The count is distributed among unhalted logical processors (hyper-threads) who share the same physical core. Software can use this event as the denominator for the top-level metrics of the TMA method. This architectural event is counted on a designated fixed counter (Fixed Counter 3).",
+ "SampleAfterValue": "10000003",
+ "UMask": "0x4"
+ },
+ {
+ "BriefDescription": "TMA slots available for an unhalted logical processor. General counter - architectural event",
+ "CollectPEBSRecord": "2",
+ "Counter": "0,1,2,3,4,5,6,7",
+ "EventCode": "0xa4",
+ "EventName": "TOPDOWN.SLOTS_P",
+ "PEBScounters": "0,1,2,3,4,5,6,7",
+ "PublicDescription": "Counts the number of available slots for an unhalted logical processor. The event increments by machine-width of the narrowest pipeline as employed by the Top-down Microarchitecture Analysis method. The count is distributed among unhalted logical processors (hyper-threads) who share the same physical core.",
+ "SampleAfterValue": "10000003",
+ "UMask": "0x1"
+ },
{
"BriefDescription": "Number of uops decoded out of instructions exclusively fetched by decoder 0",
"CollectPEBSRecord": "2",

--
Zhengjun Xing