[PATCH v7 0/5] Reference count checker and related fixes

From: Ian Rogers
Date: Fri Apr 07 2023 - 19:04:36 EST


The perf tool has a class of memory problems where reference counts
are used incorrectly. Memory/address sanitizers and valgrind don't
provide useful ways to debug these problems, you see a memory leak
where the only pertinent information is the original allocation
site. What would be more useful is knowing where a get fails to have a
corresponding put, where there are double puts, etc.

This work was motivated by the roll-back of:
https://lore.kernel.org/linux-perf-users/20211118193714.2293728-1-irogers@xxxxxxxxxx/
where fixing a missed put resulted in a use-after-free in a different
context. There was a sense in fixing the issue that a game of
wac-a-mole had been embarked upon in adding missed gets and puts.

The basic approach of the change is to add a level of indirection at
the get and put calls. Get allocates a level of indirection that, if
no corresponding put is called, becomes a memory leak (and associated
stack trace) that leak sanitizer can report. Similarly if two puts are
called for the same get, then a double free can be detected by address
sanitizer. This can also detect the use after put, which should also
yield a segv without a sanitizer.

Adding reference count checking to cpu map was done as a proof of
concept, it yielded little other than a location where the use of get
could be cleaner by using its result. Reference count checking on
nsinfo identified a double free of the indirection layer and the
related threads, thereby identifying a data race as discussed here:
https://lore.kernel.org/linux-perf-users/CAP-5=fWZH20L4kv-BwVtGLwR=Em3AOOT+Q4QGivvQuYn5AsPRg@xxxxxxxxxxxxxx/
Accordingly the dso->lock was extended and use to cover the race.

The v3 version addresses problems in v2, in particular using macros to
avoid #ifdefs. The v3 version applies the reference count checking
approach to two more data structures, maps and map. While maps was
straightforward, struct map showed a problem where reference counted
thing can be on lists and rb-trees that are oblivious to the
reference count. To sanitize this, struct map is changed so that it is
referenced by either a list or rb-tree node and not part of it. This
simplifies the reference count and the patches have caught and fixed a
number of missed or mismatched reference counts relating to struct
map.

A wider discussion of the approach is on the mailing list:
https://lore.kernel.org/linux-perf-users/YffqnynWcc5oFkI5@xxxxxxxxxx/T/#mf25ccd7a2e03de92cec29d36e2999a8ab5ec7f88
Comparing it to a past approach:
https://lore.kernel.org/all/20151209021047.10245.8918.stgit@localhost.localdomain/
and to ref_tracker:
https://lwn.net/Articles/877603/

v7. rebase on top of merged and Arnaldo fixed changes. The remaining 5
patches no longer refactor APIs but just add the necessary
reference count checking macros and usage.
v6. rebase removing 5 merged changes. Fix missed issues with libunwind.
v5. rebase removing 5 merged changes. Add map_list_node__new to the
1st patch (perf map: Move map list node into symbol) as suggested
by Arnaldo. Remove unnecessary map__puts from patch 12 (perf map:
Changes to reference counting) as suggested by Adrian.
v4. rebases on to acme's perf-tools-next, fixes more issues with
map/maps and breaks apart the accessor functions to reduce
individual patch sizes. The accessor functions are mechanical
changes where the single biggest one is refactoring use of
map->dso to be map__dso(map).

Ian Rogers (5):
libperf: Add reference count checking macros.
perf cpumap: Add reference count checking
perf namespaces: Add reference count checking
perf maps: Add reference count checking.
perf map: Add reference count checking

tools/lib/perf/Makefile | 2 +-
tools/lib/perf/cpumap.c | 94 ++++++++-------
tools/lib/perf/include/internal/cpumap.h | 4 +-
tools/lib/perf/include/internal/rc_check.h | 94 +++++++++++++++
tools/perf/builtin-inject.c | 2 +-
tools/perf/builtin-top.c | 4 +-
tools/perf/tests/cpumap.c | 4 +-
tools/perf/tests/hists_link.c | 2 +-
tools/perf/tests/maps.c | 20 ++--
tools/perf/tests/thread-maps-share.c | 29 ++---
tools/perf/tests/vmlinux-kallsyms.c | 4 +-
tools/perf/util/annotate.c | 2 +-
tools/perf/util/cpumap.c | 40 +++----
tools/perf/util/dso.c | 2 +-
tools/perf/util/dsos.c | 2 +-
tools/perf/util/machine.c | 27 +++--
tools/perf/util/map.c | 69 ++++++-----
tools/perf/util/map.h | 32 ++---
tools/perf/util/maps.c | 64 +++++-----
tools/perf/util/maps.h | 17 +--
tools/perf/util/namespaces.c | 132 ++++++++++++---------
tools/perf/util/namespaces.h | 3 +-
tools/perf/util/pmu.c | 8 +-
tools/perf/util/symbol-elf.c | 26 ++--
tools/perf/util/symbol.c | 55 +++++----
tools/perf/util/unwind-libdw.c | 2 +-
tools/perf/util/unwind-libunwind-local.c | 2 +-
tools/perf/util/unwind-libunwind.c | 2 +-
28 files changed, 448 insertions(+), 296 deletions(-)
create mode 100644 tools/lib/perf/include/internal/rc_check.h

--
2.40.0.577.gac1e443424-goog