Re: perf tool: Issues with metricgroups
From: Ian Rogers
Date: Wed Jun 09 2021 - 02:16:55 EST
On Mon, Jun 7, 2021 at 10:21 AM John Garry <john.garry@xxxxxxxxxx> wrote:
>
> Hi guys,
>
> I am finding a couple of issues in metricgroup support. Firstly I have
> found a segfault (which I caused with my "fix" in commit 9c880c24cb0d),
> and a fix for that is at the bottom.
>
> Another issue is that the ordering of the metrics we supply for stat
> command causes an issue.
>
> On my broadwell, this works ok:
>
> sudo ./perf stat -vv -M backend_bound,retiring sleep 1
> Using CPUID GenuineIntel-6-3D-4
> metric expr 1 - ( (idq_uops_not_delivered.core / (4 * cycles)) + ((
> uops_issued.any - uops_retired.retire_slots + 4 *
> int_misc.recovery_cycles ) / (4 * cycles)) + (uops_retired.retire_slots
> / (4 * cycles)) ) for Backend_Bound
> found event uops_issued.any
> found event cycles
> found event idq_uops_not_delivered.core
> found event int_misc.recovery_cycles
> found event uops_retired.retire_slots
> metric expr uops_retired.retire_slots / (4 * cycles) for Retiring
> found event cycles
> found event uops_retired.retire_slots
> adding
> {uops_issued.any,cycles,idq_uops_not_delivered.core,int_misc.recovery_cycles,uops_retired.retire_slots}:W,{cycles,uops_retired.retire_slots}:W
> uops_issued.any -> cpu/umask=0x1,(null)=0x1e8483,event=0xe/
> idq_uops_not_delivered.core -> cpu/umask=0x1,(null)=0x1e8483,event=0x9c/
> int_misc.recovery_cycles ->
> cpu/umask=0x3,(null)=0x1e8483,cmask=0x1,event=0xd/
> uops_retired.retire_slots -> cpu/umask=0x2,(null)=0x1e8483,event=0xc2/
> uops_retired.retire_slots -> cpu/umask=0x2,(null)=0x1e8483,event=0xc2/
> Control descriptor is not initialized
> uops_issued.any: 1655376 547221 547221
> cycles: 1665343 547221 547221
> idq_uops_not_delivered.core: 1983394 547221 547221
> int_misc.recovery_cycles: 69571 547221 547221
> uops_retired.retire_slots: 1311124 547221 547221
>
> Performance counter stats for 'sleep 1':
>
> 1,655,376 uops_issued.any # 0.41
> Backend_Bound
> 1,665,343 cycles
>
> # 0.20 Retiring
>
> 1,983,394 idq_uops_not_delivered.core
>
> 69,571 int_misc.recovery_cycles
>
> 1,311,124 uops_retired.retire_slots
>
>
> 1.000992470 seconds time elapsed
>
> 0.001031000 seconds user
> 0.000000000 seconds sys
> ----/---
>
> But when I reorder the metrics, it fails:
>
> sudo ./perf stat -v -M retiring,backend_bound sleep 1
> Using CPUID GenuineIntel-6-3D-4
> metric expr uops_retired.retire_slots / (4 * cycles) for Retiring
> found event cycles
> found event uops_retired.retire_slots
> metric expr 1 - ( (idq_uops_not_delivered.core / (4 * cycles)) + ((
> uops_issued.any - uops_retired.retire_slots + 4 *
> int_misc.recovery_cycles ) / (4 * cycles)) + (uops_retired.retire_slots
> / (4 * cycles)) ) for Backend_Bound
> found event uops_issued.any
> found event cycles
> found event idq_uops_not_delivered.core
> found event int_misc.recovery_cycles
> found event uops_retired.retire_slots
> adding
> {cycles,uops_retired.retire_slots}:W,{uops_issued.any,cycles,idq_uops_not_delivered.core,int_misc.recovery_cycles,uops_retired.retire_slots}:W
> uops_retired.retire_slots -> cpu/umask=0x2,(null)=0x1e8483,event=0xc2/
> uops_issued.any -> cpu/umask=0x1,(null)=0x1e8483,event=0xe/
> idq_uops_not_delivered.core -> cpu/umask=0x1,(null)=0x1e8483,event=0x9c/
> int_misc.recovery_cycles ->
> cpu/umask=0x3,(null)=0x1e8483,cmask=0x1,event=0xd/
> uops_retired.retire_slots -> cpu/umask=0x2,(null)=0x1e8483,event=0xc2/
> Control descriptor is not initialized
> cycles: 1695223 563189 563189
> uops_retired.retire_slots: 1343463 563189 563189
> uops_issued.any: 0 563189 0
> cycles: 0 563189 0
> idq_uops_not_delivered.core: 0 563189 0
> int_misc.recovery_cycles: 0 563189 0
> uops_retired.retire_slots: 0 563189 0
>
> Performance counter stats for 'sleep 1':
>
> 1,695,223 cycles
>
> # 0.20 Retiring
>
> 1,343,463 uops_retired.retire_slots
>
> <not counted> uops_issued.any
> (0.00%)
> <not counted> cycles
> (0.00%)
> <not counted> idq_uops_not_delivered.core
> (0.00%)
> <not counted> int_misc.recovery_cycles
> (0.00%)
> <not counted> uops_retired.retire_slots
> (0.00%)
>
> 1.000980001 seconds time elapsed
>
> 0.001028000 seconds user
> 0.000000000 seconds sys
>
> ----/---
>
> I think that it may be related to changes when introducing hashmap for
> evsel (that's before I started fiddling with metricgroups).
> Specifically, it looks to be in metricgroup__setup_events() ->
> find_evsel_group(). We seem to be enabling wrong evsels.
>
> I'll continue to look at this, but any help would be appreciated.
>
> Thanks,
> John
Hi John,
The fix to avoid uncore_ events being deduplicated against each other
added complexity to the code and means that metric-no-group doesn't
really work any more. I have it on my list of things to look at. It
relates to what you are looking at as the deduplication afterward is
tricky given the funny invariants on evsel names. I think it would be
easier to deduplicate events before doing the event parse. It may also
be good to change evsels so that they own the string for their name
(this would mean uncore_imc events could have unique names and not get
deduplicated against each other). The invariants around cycles in your
change look weird, but I can see how it might workaround an issue. My
attempts to reproduce the issue weren't successful on a SkylakeX.
Thanks for reporting the issues. I planned to look at this logic to
fix metric-no-group, it'd be nice to land:
https://lore.kernel.org/lkml/20210112230434.2631593-1-irogers@xxxxxxxxxx/
just so that I'm not making patch sets that conflict with myself.
Ian
> perf metricgroup: Fix find_evsel_group()
>
> The following command segfaults on my x86 broadwell:
>
> $ ./perf stat -M frontend_bound,retiring,backend_bound,bad_speculation
> sleep 1
> WARNING: grouped events cpus do not match, disabling group:
> anon group { raw 0x10e }
> anon group { raw 0x10e }
> perf: util/evsel.c:1596: get_group_fd: Assertion `!(!leader->core.fd)'
> failed.
> Aborted (core dumped)
>
> The issue shows itself as a use-after-free in evlist__check_cpu_maps(),
> whereby the leader of an event selector (evsel) has been deleted (yet we
> still attempt to verify for an evsel).
>
> Fundamentally the problem comes from metricgroup__setup_events() ->
> find_evsel_group(), and has developed from the previous fix attempt in
> commit 9c880c24cb0d ("perf metricgroup: Fix for metrics containing
> duration_time").
>
> The problem now is that the logic in checking if an evsel is in the same
> group is subtly broken for "cycles" event. For "cycles" event, the
> pmu_name is NULL; however the logic in find_evsel_group() may set an
> event matched against "cycles" as used, when it should not be.
>
> This leads to a condition where an evsel is set, yet its leader is not.
>
> Fix the check for evsel pmu_name by not matching evsels when either has
> a NULL pmu_name.
>
> Fixes: 9c880c24cb0d ("perf metricgroup: Fix for metrics containing
> duration_time")
> Signed-off-by: John Garry <john.garry@xxxxxxxxxx>
>
> diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
> index 8336dd8e8098..c456fdeae06a 100644
> --- a/tools/perf/util/metricgroup.c
> +++ b/tools/perf/util/metricgroup.c
> @@ -162,10 +162,10 @@ static bool contains_event(struct evsel
> **metric_events, int num_events,
> return false;
> }
>
> -static bool evsel_same_pmu(struct evsel *ev1, struct evsel *ev2)
> +static bool evsel_same_pmu_or_none(struct evsel *ev1, struct evsel *ev2)
> {
> if (!ev1->pmu_name || !ev2->pmu_name)
> - return false;
> + return true;
>
> return !strcmp(ev1->pmu_name, ev2->pmu_name);
> }
> @@ -288,7 +288,7 @@ static struct evsel *find_evsel_group(struct evlist
> *perf_evlist,
> */
> if (!has_constraint &&
> ev->leader != metric_events[i]->leader &&
> - evsel_same_pmu(ev->leader,
> metric_events[i]->leader))
> + evsel_same_pmu_or_none(ev->leader,
> metric_events[i]->leader))
> break;
> if (!strcmp(metric_events[i]->name, ev->name)) {
> set_bit(ev->idx, evlist_used);
> lines 25-63/63 (END)
>
>