[PATCH 2/2] perf stat: balance opening/closing of events

From: Mark Rutland
Date: Fri Apr 28 2017 - 07:48:11 EST


While perf-stat has its own create_perf_stat_counter() helper to open
events, dependent on target configuration, it uses perf_evlist__close()
to close events.

The common perf_evlist__{open,close}() helpers don't consider the target
configuration, and always evsel->cpus even where
create_perf_stat_counter() would have used an empty_cpu_map.

On some systems, the CPU PMU describes cpus under sysfs, and evsel->cpus
may be set even when we open per-thread events. For per-thread events,
we don't use evsel->cpus in create_perf_stat_counter(), though
perf_evlist__close() will. Thus we try to close more events than we
allocated and opened, resulting in segfaults when we go out-of-bounds:

$ perf stat -e armv8_pmuv3/cpu_cycles/ true
*** Error in `./old-perf': free(): invalid next size (fast): 0x00000000263a55c0 ***
Aborted

This problem was introduced by commit:

3df33eff2ba96be4 ("perf stat: Avoid skew when reading events")

... prior to that commit, we closed events as we read them, using the
correct number of CPUs.

This patch fixes the problem by adding a new close_counters() routine
that mirrors create_perf_stat_counter(), ensuring that we always
correctly balance open/close.

Fixes: 3df33eff2ba96be4 ("perf stat: Avoid skew when reading events")
Signed-off-by: Mark Rutland <mark.rutland@xxxxxxx>
Tested-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@xxxxxxxxxx>
Cc: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
Cc: Alexander Shishkin <alexander.shishkin@xxxxxxxxxxxxxxx>
Cc: linux-kernel@xxxxxxxxxxxxxxx
---
tools/perf/builtin-stat.c | 24 +++++++++++++++++++++++-
1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index a935b50..ceedc0a 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -364,6 +364,28 @@ static void read_counters(void)
}
}

+/*
+ * Close all evnt FDs we open in __run_perf_stat() and
+ * create_perf_stat_counter(), taking care to match the number of threads and CPUs.
+ *
+ * Note that perf_evlist__close(evsel_list) is not equivalent, as it doesn't
+ * take the target into account.
+ */
+static void close_counters(void)
+{
+ bool per_cpu = target__has_cpu(&target);
+ struct perf_evsel *evsel;
+
+ evlist__for_each_entry(evsel_list, evsel) {
+ if (per_cpu)
+ perf_evsel__close_per_cpu(evsel,
+ perf_evsel__cpus(evsel));
+ else
+ perf_evsel__close_per_thread(evsel,
+ evsel_list->threads);
+ }
+}
+
static void process_interval(void)
{
struct timespec ts, rs;
@@ -704,7 +726,7 @@ static int __run_perf_stat(int argc, const char **argv)
* group leaders.
*/
read_counters();
- perf_evlist__close(evsel_list);
+ close_counters();

return WEXITSTATUS(status);
}
--
1.9.1