Re: [PATCH] perf metricgroup: Fix system PMU metrics

From: John Garry
Date: Wed Jan 20 2021 - 05:25:44 EST


On 20/01/2021 05:15, Joakim Zhang wrote:

-----Original Message-----
From: John Garry <john.garry@xxxxxxxxxx>
Sent: 2021年1月20日 1:33
To: Joakim Zhang <qiangqing.zhang@xxxxxxx>; peterz@xxxxxxxxxxxxx;
mingo@xxxxxxxxxx; acme@xxxxxxxxxx; mark.rutland@xxxxxxx;
alexander.shishkin@xxxxxxxxxxxxxxx; jolsa@xxxxxxxxxx;
namhyung@xxxxxxxxxx; irogers@xxxxxxxxxx; kjain@xxxxxxxxxxxxx
Cc: linux-kernel@xxxxxxxxxxxxxxx; linuxarm@xxxxxxxxxxxxx
Subject: Re: [PATCH] perf metricgroup: Fix system PMU metrics

On 19/01/2021 15:47, John Garry wrote:
On 19/01/2021 10:56, Joakim Zhang wrote:
It seems have other issue compared to 5.10 kernel after switching to
this framework, below metric can't work.
"MetricExpr": "(( imx8_ddr0@read\\-cycles@ +
imx8_ddr0@write\\-cycles@
) * 4 * 4 / duration_time) / (750 * 1000000 * 4 * 4)"
After change to:
"MetricExpr": "(( imx8mm_ddr.read_cycles + imx8mm_ddr.write_cycles )
*
4 * 4 / duration_time) / (750 * 1000000 * 4 * 4)",

It seems that any metric which includes "duration_time" is broken,
even on x86:

john@localhost:~/acme/tools/perf> sudo ./perf stat -v -M
L1D_Cache_Fill_BW sleep 1 Using CPUID GenuineIntel-6-3D-4 metric expr
64 * l1d.replacement / 1000000000 / duration_time for
L1D_Cache_Fill_BW found event duration_time found event
l1d.replacement adding {l1d.replacement}:W,duration_time
l1d.replacement -> cpu/umask=0x1,(null)=0x1e8483,event=0x51/
Segmentation fault


Seems to be from my commit c2337d67199 ("perf metricgroup: Fix metrics
using aliases covering multiple PMUs")

I'll look to fix it now.


Please try this:

From 2380f1ef0250e6818b3dbc7bff4a868810875e2a Mon Sep 17 00:00:00
2001
From: John Garry <john.garry@xxxxxxxxxx>
Date: Tue, 19 Jan 2021 17:29:54 +0000
Subject: [PATCH] perf metricgroup: Fix metric support for duration_time

For a metric using duration_time, the strcmp() check when finding identical
events in metric_events[] is broken, as it does not consider that the
event pmu_name is NULL - it would be for duration_time.

As such, add a NULL check here for event pmu_name.

Signed-off-by: John Garry <john.garry@xxxxxxxxxx>

diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index ee94d3e8dd65..277adff8017f 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -280,6 +280,8 @@ static struct evsel *find_evsel_group(struct evlist
*perf_evlist,
*/
if (!has_constraint &&
ev->leader != metric_events[i]->leader &&
+ ev->leader->pmu_name &&
+ metric_events[i]->leader->pmu_name &&
!strcmp(ev->leader->pmu_name,
metric_events[i]->leader->pmu_name))
break;
--
2.26.2



For this patch: Tested-by: Joakim Zhang <qiangqing.zhang@xxxxxxx>

Hi John, Jolsa,

Is there any way to avoid breaking exist metric expressions? If not, it will always happened after metricgroup changes.


They are not normally broken like that. Normally we test beforehand, but these cases were missed here by me. However if you were testing them previously, then it would be expected that you had tested them again for the final patchset which was merged.

Anyway, we can look to add metric tests for these.

@Arnaldo, I will send separate formal patch for this today.

Thanks,
John