Re: [PATCH v12 00/19] perf: Use e_machine and lazily compute symbols
From: Ian Rogers
Date: Tue Jun 02 2026 - 13:07:27 EST
On Tue, Jun 2, 2026 at 8:25 AM Ian Rogers <irogers@xxxxxxxxxx> wrote:
>
> Add a helper to perf_env to compute the e_machine if it is EM_NONE.
> Derive the value from the arch string if available. Similarly derive
> the arch string from the ELF machine if available, for consistency.
> This means perf's arch (machine type) is no longer determined by uname
> but set to match that of the perf ELF executable.
>
> Migrate code away from strcmp on env->arch to using the e_machine
> comparisons that are more accurate and not prone to uname and other
> naming differences. While cleaning this up, also clean up the
> capstone initialization code to cover more architectures and to set
> the big endian flag based on ELF header information.
>
> Refactor perf_env__arch_strerrno to take an e_machine instead of an
> architecture string, removing the HAVE_LIBTRACEEVENT dependency
> entirely and making it unconditionally available. The generated errno
> table includes fallback definitions for newer ELF machine constants to
> ensure compatibility with older host glibc versions.
>
> Introduce a mutex in perf_env to safely protect lazy metadata setup,
> such as os_release or e_machine resolution, preventing concurrent
> initialization data races and memory leaks during multi-threaded
> profiling or symbol loading. Properly initialize stack-allocated
> perf_env instances to ensure safe mutex destruction.
>
> Switch the idle computation to the point of use and lazily compute it,
> rather than computing it for every symbol. The current only user is
> `perf top`. At the point of use the perf_env is available and this can
> be used to make sure the idle function computation correctly accounts
> for architecture-specific and kernel-version-specific patterns.
> To prevent concurrent updates to shared symbol bitfield flags, migrate
> bitfield variables in struct symbol to C11 atomic flags.
So I think this series is at the point where Sashiko [1] is giving
warnings only for out-of-scope things and pre-existing conditions. I
will give a detailed explanation below, but I'd appreciate help moving
this forward with human review and submission. Thanks!
> Ian Rogers (19):
> perf env: Add perf_env__e_machine helper and use in perf_env__arch
1 critical and 2 high issues.
The issues relate to existing data races, the inaccurate arch string,
and normalizing the arch string stored in the data file. The existing
data races don't bite us currently due to the single-threaded nature
of most of perf - multithreading is on the TODO list. The arch string
is inaccurate and the e_machine in newer perf.data files resolves
this. If we were using the arch string without the e_machine then the
concerns over its use are valid, but this series is trying to remove
the use of the arch string and strongly prefer the e_machine.
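
To make the "prefer e_machine, fall back to the arch string" idea
concrete, here is a minimal sketch. It is not code from the series: the
struct, the function names and the (deliberately incomplete) mapping
table are all hypothetical and only illustrate the shape of the lookup.

```c
#include <elf.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the relevant perf_env fields. */
struct env_sketch {
	const char *arch;	/* arch string from the data file, may be NULL */
	uint16_t e_machine;	/* EM_NONE when no ELF machine was recorded */
};

/* Illustrative, incomplete mapping; not the table used by the series. */
static uint16_t arch_to_e_machine(const char *arch)
{
	if (!arch)
		return EM_NONE;
	if (!strcmp(arch, "x86_64"))
		return EM_X86_64;
	if (!strcmp(arch, "i386"))
		return EM_386;
	if (!strcmp(arch, "aarch64") || !strcmp(arch, "arm64"))
		return EM_AARCH64;
	if (!strcmp(arch, "s390x"))
		return EM_S390;
	if (!strcmp(arch, "ppc64le"))
		return EM_PPC64;
	return EM_NONE;
}

/*
 * Return the recorded e_machine if there is one; only consult the arch
 * string when the value is still EM_NONE, and cache the result.
 */
static uint16_t env_sketch__e_machine(struct env_sketch *env)
{
	if (env->e_machine == EM_NONE)
		env->e_machine = arch_to_e_machine(env->arch);
	return env->e_machine;
}
```

With this shape a newer data file that carries an e_machine never looks
at the arch string at all, which is why its inaccuracy stops mattering.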
> perf tests topology: Switch env->arch use to env->e_machine
No regressions.
> perf env, dso, thread: Add _endian variants for e_machine helpers
1 high issue for a potential pre-existing SEGV if a thread lacks maps.
Let's hope that doesn't happen; the example given assumes a
multithreaded environment, and multithreading is on the TODO list.
> perf capstone: Determine architecture from e_machine
1 low issue. A flag only present in capstone 4.0 is used. As capstone
4.0 was released in 2018, let's just assume the flag is there rather
than adding yet more complexity.
> perf print_insn: Use e_machine for fallback IP length check
No regressions.
> perf symbol: Avoid use of machine__is
1 high issue. Concerns over pre-existing cross-platform analysis
problems. Cross-platform analysis fully working is on the TODO list.
> perf machine: Use perf_env e_machine rather than arch
> perf sample-raw: Use perf_env e_machine rather than arch
> perf sort: Use perf_env e_machine rather than arch
> perf arch common: Use perf_env e_machine rather than arch
> perf header: In print_pmu_caps use perf_env e_machine
> perf c2c: Use perf_env e_machine rather than arch
> perf lock-contention: Use perf_env e_machine rather than arch
> perf env: Refactor perf_env__arch_strerrno
> perf env: Remove unused perf_env__raw_arch
No regressions x9.
> perf env: Add mutex to protect lazy environment initialization
1 medium issue requesting more locking on more bits of perf_env.
Multi-threading is on the TODO list and let's stop the feature creep
here.
> perf env: Add helper to lazily compute the os_release
1 high issue. Concern over a perf data issue in pipe mode. Addressing
this would require a fairly major overhaul of perf data, so let's add
fixing it to the TODO list.
> perf symbol: Add setters for bitfields sharing a byte to avoid
> concurrent update issues
> perf symbol: Lazily compute idle
No regressions x2.
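
As background on why bitfields sharing a byte are a problem: a bitfield
store is a read-modify-write of the containing byte, so two threads
setting different bits can lose one of the updates. A minimal sketch of
the setter/getter approach using C11 atomics is below. The flag names,
struct and helpers are hypothetical and not the definitions from the
series.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits that previously would have been 1-bit bitfields. */
enum {
	SYM_FLAG_IDLE	= 1 << 0,
	SYM_FLAG_IGNORE	= 1 << 1,
};

/* Hypothetical stand-in for the flag byte in struct symbol. */
struct sym_sketch {
	_Atomic uint8_t flags;
};

/* Atomically set or clear one flag without disturbing its neighbours. */
static void sym_sketch__set_flag(struct sym_sketch *sym, uint8_t flag, bool value)
{
	if (value)
		atomic_fetch_or(&sym->flags, flag);
	else
		atomic_fetch_and(&sym->flags, (uint8_t)~flag);
}

/* Atomically read one flag. */
static bool sym_sketch__flag(struct sym_sketch *sym, uint8_t flag)
{
	return (atomic_load(&sym->flags) & flag) != 0;
}
```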
Thanks,
Ian
[1] https://sashiko.dev/#/patchset/20260602152516.2831152-1-irogers%40google.com
>
> tools/perf/arch/common.c | 92 +++--
> tools/perf/builtin-c2c.c | 40 +-
> tools/perf/builtin-inject.c | 10 +-
> tools/perf/builtin-kwork.c | 2 +-
> tools/perf/builtin-report.c | 2 +-
> tools/perf/builtin-sched.c | 4 +-
> tools/perf/builtin-top.c | 7 +-
> tools/perf/builtin-trace.c | 7 +-
> tools/perf/tests/symbols.c | 2 +-
> tools/perf/tests/topology.c | 8 +-
> tools/perf/tests/vmlinux-kallsyms.c | 2 +-
> tools/perf/trace/beauty/Build | 1 +
> tools/perf/trace/beauty/arch_errno_names.sh | 53 ++-
> tools/perf/ui/browsers/annotate.c | 2 +-
> tools/perf/ui/browsers/map.c | 4 +-
> tools/perf/util/annotate.c | 5 +-
> tools/perf/util/auxtrace.c | 6 +-
> tools/perf/util/callchain.c | 4 +-
> tools/perf/util/capstone.c | 132 ++++--
> tools/perf/util/data-convert-bt.c | 2 +-
> tools/perf/util/data-convert-json.c | 6 +-
> tools/perf/util/dlfilter.c | 2 +-
> tools/perf/util/dso.c | 19 +-
> tools/perf/util/dso.h | 14 +-
> tools/perf/util/env.c | 376 ++++++++++++++----
> tools/perf/util/env.h | 14 +-
> tools/perf/util/evsel_fprintf.c | 6 +-
> tools/perf/util/header.c | 55 ++-
> tools/perf/util/intel-pt.c | 2 +-
> tools/perf/util/libdw.c | 2 +-
> tools/perf/util/lock-contention.c | 6 +-
> tools/perf/util/machine.c | 37 +-
> tools/perf/util/machine.h | 2 -
> tools/perf/util/print_insn.c | 23 +-
> tools/perf/util/print_insn.h | 2 +
> tools/perf/util/probe-event.c | 4 +-
> tools/perf/util/sample-raw.c | 21 +-
> tools/perf/util/sample-raw.h | 6 +-
> .../util/scripting-engines/trace-event-perl.c | 2 +-
> .../scripting-engines/trace-event-python.c | 4 +-
> tools/perf/util/session.c | 26 +-
> tools/perf/util/sort.c | 66 +--
> tools/perf/util/srcline.c | 10 +-
> tools/perf/util/symbol-elf.c | 5 +-
> tools/perf/util/symbol.c | 238 ++++++++---
> tools/perf/util/symbol.h | 80 +++-
> tools/perf/util/symbol_fprintf.c | 4 +-
> tools/perf/util/thread.c | 58 ++-
> tools/perf/util/thread.h | 23 +-
> 49 files changed, 1078 insertions(+), 420 deletions(-)
>
> --
> 2.54.0.929.g9b7fa37559-goog
>