Re: [PATCH v2 00/13] dso/dsos memory savings and clean up

From: Namhyung Kim
Date: Mon Mar 25 2024 - 17:03:46 EST


Hi Ian,

On Thu, Mar 21, 2024 at 9:03 AM Ian Rogers <irogers@xxxxxxxxxx> wrote:
>
> 13 more patches from:
> https://lore.kernel.org/lkml/20240202061532.1939474-1-irogers@xxxxxxxxxx/
> a near half year old adventure in trying to lower perf's dynamic
> memory use. Bits like the memory overhead of opendir are on the
> sidelines for now, too much fighting over how
> distributions/C-libraries present getdents. These changes are more
> good old fashioned replace an rb-tree with a sorted array and add
> reference count tracking.
>
> The changes migrate dsos code, the collection of dso structs, more
> into the dsos.c/dsos.h files. As with maps and threads, this is done
> so the internals can be changed - replacing a linked list (for fast
> iteration) and an rb-tree (for fast finds) with a lazily sorted
> array. The complexity of operations remain roughly the same, although
> iterating an array is likely faster than iterating a linked list, the
> memory usage is at least reduce by half.
>
> As fixing the memory usage necessitates changing operations like find,
> modify these operations so that they increment the reference count to
> avoid races like a find in dsos and a remove. Similarly tighten up
> lock usage so that operations working on dsos state hold the
> appropriate lock.
>
> Here are some questions (with answers) that I am expecting from reviewers:
>
> - Why not refactor dso with accessors first and then do the other things?
>
> My ambition with this change was to lower memory overhead not to
> particularly clean up and fix dso. Fixing the memory overhead, by
> refactoring and changing the internals, showed that locking discipline
> and reference counting discipline was lacking. The later changes try
> to fix these as a service to the community while I am changing the
> code and to also ensure that code is correct (more correct than it was
> wrt locking and reference counting than before the patches).
>
> Reordering the patches to do the refactoring first will be a giant
> pain. It will merge conflict with every other patch in the series and
> is basically a request to reimplement everything from square 1. The
> only thing I'd have in my favor would be how the code should look at
> the end of the series, and reordering patches doesn't change the
> eventual outcome of applying the patches. Note also, were I to send
> the memory saving patches and then a week later send the API clean up
> and reference counting fix patches the patches would be merged in the
> order they are here. I've done my best, I know you may consider that
> I'm adding to your reviewing overhead but I've also got to think about
> the overhead to me.
>
> - Please break apart this change...
>
> The first changes are moving things, but when a broken API is spotted
> like the missing get on dsos__find I put it in a change to move the
> function and to add the missed get. Could this be two changes? Yes, it
> could. Does moving code materially change the behavior of the tool?
> No. I've done it in one patch to minimize churn and to some extent for
> my sanity. Such changes are less than 100 lines of code and all
> independently tested.
>
> - The logic in dso around short, long name and id with sorting is weird
>
> Yes, I've tried to make it less weird while retaining the existing
> behavior. It would be easy to make a series of patches just cleaning
> it up but I came here to save memory not change the dso API.
>
> - Move the fixes in the 12th patch earlier.
>
> This is possible but then impossible to test with reference count
> checking. This does mean there are broken reference counts before the
> patch is applied, but this is generally already the case. Yes, some
> hypothetical person may decide to fork midway through this patch
> series and my order would mean they wouldn't have a fix. I've done my
> best while working within the bounds of my time and trying to avoid
> churn.
>
> v2. Rebases on top of tmp.perf-tools-next resolving merge conflicts.
>
> Ian Rogers (13):
> perf dso: Reorder variables to save space in struct dso
> perf dsos: Attempt to better abstract dsos internals
> perf dsos: Tidy reference counting and locking
> perf dsos: Add dsos__for_each_dso
> perf dso: Move dso functions out of dsos
> perf dsos: Switch more loops to dsos__for_each_dso
> perf dsos: Switch backing storage to array from rbtree/list
> perf dsos: Remove __dsos__addnew
> perf dsos: Remove __dsos__findnew_link_by_longname_id
> perf dsos: Switch hand code to bsearch
> perf dso: Add reference count checking and accessor functions
> perf dso: Reference counting related fixes
> perf dso: Use container_of to avoid a pointer in dso_data

Acked-by: Namhyung Kim <namhyung@xxxxxxxxxx>

Thanks,
Namhyung

>
> tools/perf/builtin-annotate.c | 8 +-
> tools/perf/builtin-buildid-cache.c | 2 +-
> tools/perf/builtin-buildid-list.c | 18 +-
> tools/perf/builtin-inject.c | 96 +--
> tools/perf/builtin-kallsyms.c | 2 +-
> tools/perf/builtin-mem.c | 4 +-
> tools/perf/builtin-record.c | 2 +-
> tools/perf/builtin-report.c | 6 +-
> tools/perf/builtin-script.c | 8 +-
> tools/perf/builtin-top.c | 4 +-
> tools/perf/builtin-trace.c | 2 +-
> tools/perf/tests/code-reading.c | 8 +-
> tools/perf/tests/dso-data.c | 67 ++-
> tools/perf/tests/hists_common.c | 6 +-
> tools/perf/tests/hists_cumulate.c | 4 +-
> tools/perf/tests/hists_output.c | 2 +-
> tools/perf/tests/maps.c | 4 +-
> tools/perf/tests/symbols.c | 8 +-
> tools/perf/tests/vmlinux-kallsyms.c | 6 +-
> tools/perf/ui/browsers/annotate.c | 6 +-
> tools/perf/ui/browsers/hists.c | 8 +-
> tools/perf/ui/browsers/map.c | 4 +-
> tools/perf/util/annotate-data.c | 14 +-
> tools/perf/util/annotate.c | 47 +-
> tools/perf/util/auxtrace.c | 2 +-
> tools/perf/util/block-info.c | 2 +-
> tools/perf/util/bpf-event.c | 8 +-
> tools/perf/util/build-id.c | 136 ++---
> tools/perf/util/build-id.h | 2 -
> tools/perf/util/callchain.c | 2 +-
> tools/perf/util/data-convert-json.c | 2 +-
> tools/perf/util/db-export.c | 6 +-
> tools/perf/util/dlfilter.c | 12 +-
> tools/perf/util/dso.c | 472 +++++++++------
> tools/perf/util/dso.h | 554 +++++++++++++++---
> tools/perf/util/dsos.c | 529 +++++++++++------
> tools/perf/util/dsos.h | 40 +-
> tools/perf/util/event.c | 8 +-
> tools/perf/util/header.c | 8 +-
> tools/perf/util/hist.c | 4 +-
> tools/perf/util/intel-pt.c | 22 +-
> tools/perf/util/machine.c | 192 ++----
> tools/perf/util/machine.h | 2 +
> tools/perf/util/map.c | 82 ++-
> tools/perf/util/maps.c | 14 +-
> tools/perf/util/probe-event.c | 25 +-
> .../util/scripting-engines/trace-event-perl.c | 6 +-
> .../scripting-engines/trace-event-python.c | 21 +-
> tools/perf/util/session.c | 21 +
> tools/perf/util/session.h | 2 +
> tools/perf/util/sort.c | 19 +-
> tools/perf/util/srcline.c | 65 +-
> tools/perf/util/symbol-elf.c | 145 +++--
> tools/perf/util/symbol-minimal.c | 4 +-
> tools/perf/util/symbol.c | 186 +++---
> tools/perf/util/symbol_fprintf.c | 4 +-
> tools/perf/util/synthetic-events.c | 24 +-
> tools/perf/util/thread.c | 4 +-
> tools/perf/util/unwind-libunwind-local.c | 18 +-
> tools/perf/util/unwind-libunwind.c | 2 +-
> tools/perf/util/vdso.c | 56 +-
> 61 files changed, 1817 insertions(+), 1220 deletions(-)
>
> --
> 2.44.0.396.g6e790dbe36-goog
>