[PATCH 2/2] perf/bench: Double free of dynamic allocated memory
From: Thomas Richter
Date: Tue Mar 18 2025 - 05:54:24 EST
On s390 z/VM the command 'perf bench internals pmu-scan'
dumps core, as can be seen:
Output before:
# ./perf bench internals pmu-scan
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 100 times
perf: /root/linux/tools/include/linux/refcount.h:131:
refcount_sub_and_test: Assertion `!(new > val)' failed.
Aborted (core dumped)
#
The root cause is in
perf_pmus__scan()
+--> perf_pmu__create_placeholder_core_pmu()
+--> cpu_map__online()
cpu_map__online() has a static variable
static struct perf_cpu_map *online;
if (!online)
online = perf_cpu_map__new_online_cpus();
return online;
which is allocated once when entered for the first time.
However perf_pmu__create_placeholder_core_pmu() is actually called
two times.
First time:
run_pmu_scan()
+--> save_result()
+---> perf_pmus__scan_core()
+--> pmu_read_sysfs()
+--> perf_pmu__create_placeholder_core_pmu()
...
+--> perf_pmus__destroy()
Second time:
run_pmu_scan()
+--> perf_pmus__scan()
+--> pmu_read_sysfs()
+--> perf_pmu__create_placeholder_core_pmu()
...
+--> perf_pmus__destroy()
The second time the already allocated memory pointed to by variable
'online' is returned. However in between the first and second call
of perf_pmu__create_placeholder_core_pmu()
function save_result() also frees all PMUs:
save_result()
+--> perf_pmus__destroy()
+--> perf_pmu__delete()
+--> perf_cpu_map__put()
+--> cpu_map__delete()
cpu_map__delete() deletes the perf_cpu_map pointed to by variable
online, but this static variable is not set to NULL. In the second
invocation of perf_pmu__create_placeholder_core_pmu() the same
memory locattion stored in variable online is returned.
Later on run_pmu_scan() calls perf_pmus__destroy() again and then
cpu_map__delete() frees the PMU "cpu->cpus" a second time causing
the core dump.
Avoid core dump and always allocate the online CPUs.
Output after:
# ./perf bench internals pmu-scan
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 100 times
Average core PMU scanning took: 7.970 usec (+- 0.147 usec)
Average PMU scanning took: 60.415 usec (+- 3.986 usec)
#
Background: s390 z/VM system do not support PMUs for sampling and
counting. In this case dummy events are created by the perf tool
and the PMUs "tool" and "fake" are created and freed.
Fixes: a0c41caebab2f ("perf pmu: Add CPU map for "cpu" PMUs")
Signed-off-by: Thomas Richter <tmricht@xxxxxxxxxxxxx>
Cc: Ian Rogers <irogers@xxxxxxxxxx>
---
tools/perf/util/cpumap.c | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index 5c329ad614e9..ab9e7a266af9 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -691,12 +691,7 @@ size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size)
struct perf_cpu_map *cpu_map__online(void) /* thread unsafe */
{
- static struct perf_cpu_map *online;
-
- if (!online)
- online = perf_cpu_map__new_online_cpus(); /* from /sys/devices/system/cpu/online */
-
- return online;
+ return perf_cpu_map__new_online_cpus(); /* from /sys/devices/system/cpu/online */
}
bool aggr_cpu_id__equal(const struct aggr_cpu_id *a, const struct aggr_cpu_id *b)
--
2.48.1