[PATCH v10 00/18] Add perf_env__e_machine and migrate arch string comparisons to e_machine
From: Ian Rogers
Date: Mon Jun 01 2026 - 03:01:37 EST
Add a helper to perf_env to compute the e_machine if it is EM_NONE.
Derive the value from the arch string if available. Similarly derive
the arch string from the ELF machine if available, for
consistency. This means perf's arch (machine type) is no longer
determined by uname but set to match that of the perf ELF executable.
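As a rough illustration of the arch string to ELF machine derivation described above, the sketch below maps a few uname-style names to `EM_*` constants from `<elf.h>`. The function name `sketch_arch_to_e_machine()` and the table contents are assumptions made for illustration; they are not the code in this series.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mapping entry; the real table in the series may differ. */
struct sketch_arch_map {
	const char *arch;
	uint16_t e_machine;
};

static const struct sketch_arch_map sketch_arch_table[] = {
	{ "x86_64",  EM_X86_64 },
	{ "i386",    EM_386 },
	{ "aarch64", EM_AARCH64 },
	{ "arm64",   EM_AARCH64 },	/* two spellings, one ELF machine */
	{ "riscv64", EM_RISCV },
	{ "s390x",   EM_S390 },
};

/* Return the ELF machine for an arch string, or EM_NONE if unknown. */
static uint16_t sketch_arch_to_e_machine(const char *arch)
{
	size_t i;

	if (!arch)
		return EM_NONE;

	for (i = 0; i < sizeof(sketch_arch_table) / sizeof(sketch_arch_table[0]); i++) {
		if (!strcmp(arch, sketch_arch_table[i].arch))
			return sketch_arch_table[i].e_machine;
	}
	return EM_NONE;
}
```

Comparing the returned integer avoids treating "arm64" and "aarch64" as different architectures, which is the kind of naming difference a strcmp on the arch string is prone to.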
Migrate code away from strcmp on env->arch to using the e_machine
comparisons that are more accurate and not prone to uname and other
naming differences. While cleaning this up, also clean up the capstone
initialization code to cover more architectures and to set the big
endian flag based on ELF header information.
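To make the endianness point concrete, here is a minimal sketch of reading the data encoding from the ELF identification bytes. The helper name `sketch_elf_is_big_endian()` is hypothetical; only the `<elf.h>` constants are real. A caller could then OR Capstone's `CS_MODE_BIG_ENDIAN` into the mode it passes to `cs_open()`, though how the series wires this up is not shown here.

```c
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: report whether an ELF identification array
 * describes big endian data. Non-ELF or short input is treated as
 * little endian.
 */
static bool sketch_elf_is_big_endian(const unsigned char *e_ident, size_t len)
{
	if (!e_ident || len < EI_NIDENT)
		return false;

	/* Check the "\177ELF" magic before trusting EI_DATA. */
	if (memcmp(e_ident, ELFMAG, SELFMAG) != 0)
		return false;

	return e_ident[EI_DATA] == ELFDATA2MSB;
}
```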
Switch the idle computation to the point of use and lazily compute it,
rather than computing it for every symbol. Currently the only user is
`perf top`. At the point of use the perf_env is available and this can
be used to make sure the idle function computation is machine and
kernel version dependent.
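The lazy computation can be pictured roughly as below: the idle state starts as unknown and is resolved on first query, when the machine type is known. All `sketch_*` names, the struct layout and the per-machine name lists are assumptions for illustration, and the kernel version check is omitted.

```c
#include <elf.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum sketch_idle {
	SKETCH_IDLE_UNKNOWN = 0,	/* not computed yet */
	SKETCH_IDLE_NO,
	SKETCH_IDLE_YES,
};

/* Hypothetical stand-in for the relevant part of struct symbol. */
struct sketch_symbol {
	const char *name;
	uint8_t idle;			/* holds an enum sketch_idle value */
};

/* Illustrative, machine dependent name check (lists are not exhaustive). */
static bool sketch_name_is_idle(const char *name, uint16_t e_machine)
{
	switch (e_machine) {
	case EM_X86_64:
	case EM_386:
		return !strcmp(name, "intel_idle") ||
		       !strcmp(name, "intel_idle_ibrs");
	case EM_S390:
		return !strcmp(name, "psw_idle");
	default:
		return false;
	}
}

/* Compute the idle state on first use and cache it in the symbol. */
static bool sketch_symbol_is_idle(struct sketch_symbol *sym, uint16_t e_machine)
{
	if (sym->idle == SKETCH_IDLE_UNKNOWN) {
		sym->idle = sketch_name_is_idle(sym->name, e_machine) ?
			    SKETCH_IDLE_YES : SKETCH_IDLE_NO;
	}
	return sym->idle == SKETCH_IDLE_YES;
}
```

Symbols that are never queried pay nothing, which is the point of moving the computation to the point of use.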
To avoid concurrent update issues with bitfields sharing a byte in
`struct symbol`, switch to C11 atomics.
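A minimal sketch of the idea, assuming flag bits packed into one atomic word in place of adjacent bitfields: two threads setting different bitfields in the same byte can lose an update, whereas an atomic read-modify-write on a shared flags word cannot. The struct, macro and function names here are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits; the real layout in the series may differ. */
#define SKETCH_SYM_INLINED	((uint16_t)(1u << 0))
#define SKETCH_SYM_IGNORE	((uint16_t)(1u << 1))

struct sketch_sym_flags {
	_Atomic uint16_t flags;
};

/* Set or clear one bit without disturbing concurrent updates to others. */
static void sketch_set_flag(struct sketch_sym_flags *s, uint16_t bit, bool on)
{
	if (on)
		atomic_fetch_or(&s->flags, bit);
	else
		atomic_fetch_and(&s->flags, (uint16_t)~bit);
}

/* Read one bit from the atomic flags word. */
static bool sketch_test_flag(struct sketch_sym_flags *s, uint16_t bit)
{
	return (atomic_load(&s->flags) & bit) != 0;
}
```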
v10:
- Key changes in v10:
- **Memory Safety & Robustness Hardening**: Added explicit `machine` NULL-pointer
checks in `dso__load()` and `thread` guards in `thread__e_machine_endian()`
to avoid segmentation faults during out-of-tree Make builds, OOMs,
or when loading traces with orphaned map groups.
- **Mixed-Bitness Platform Resilience**: Fully overhauled `is_native_compatible()`
to be environment-bitness-aware. This ensures 32-bit hosts correctly utilize
cross-compiler objdump lookups when analyzing 64-bit traces sharing identical
ELF machine constants (`EM_RISCV`).
- **Live Session Architecture Mappings**: Redesigned `perf_env__init_kernel_mode()`
to query runtime `uname(&uts)` and map live `env->arch` states dynamically,
removing fragile hardcoded allowlists and guaranteeing correct 64-bit classifications
for `riscv64`, `mips64`, and `parisc64`. Integrated bitness-aware 64-bit string
overrides for these shared enums inside `perf_env__arch()`.
- **Compiler Optimization Suffix Support**: Replaced strict `strcmp()` logic with an
asymmetric compiler-suffix-aware prefix helper `match_x86_idle_routine()` to safely
allowlist hot idle run-loops containing `.constprop.*` or `.isra.*` extensions while
ensuring active hotplug/management functions are not dropped.
- **Spectre-mitigated Handler & Cross-Architecture Fallbacks**: Restored the missing
`intel_idle_ibrs` Spectre mitigation idle handler to the allowlist. Configured
`symbol__is_idle()` to safely fall back to reading `dso__e_machine()` if the environment
context is missing, preserving cross-architecture offline trace evaluations.
- **Out-of-tree (`O=`) Make Builds Compatibility**: Refactored generated API translation
units and Make Build dependency rules to eliminate `-Wunused-function` and
`-Wmissing-prototypes` errors under strict `WERROR=1` out-of-tree build parameters.
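The compiler-suffix-aware matching mentioned for v10 can be sketched as follows. This is not `match_x86_idle_routine()` itself; `sketch_match_idle_name()` is a hypothetical helper that accepts an exact name or that name followed by a `.constprop.` or `.isra.` suffix, and rejects longer names that merely share the prefix.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: match "base" exactly, or "base" followed by a
 * compiler-generated ".constprop." or ".isra." suffix. Only the symbol
 * side may carry the suffix, hence the asymmetry.
 */
static bool sketch_match_idle_name(const char *sym, const char *base)
{
	static const char constprop[] = ".constprop.";
	static const char isra[] = ".isra.";
	size_t len = strlen(base);

	if (strncmp(sym, base, len) != 0)
		return false;

	/* Exact match. */
	if (sym[len] == '\0')
		return true;

	/* Otherwise only a known optimization suffix is accepted. */
	return !strncmp(sym + len, constprop, sizeof(constprop) - 1) ||
	       !strncmp(sym + len, isra, sizeof(isra) - 1);
}
```

With this shape, `intel_idle.constprop.0` matches the base `intel_idle`, while a longer name such as `intel_idle_cpu_init` does not.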
v9:
- Key changes in v9:
- **C11 Atomics for `struct symbol`**: Dropped the global
`symbol_bits_lock` introduced in v7/v8. Replaced unsafe bitfields
with a thread-safe `_Atomic uint16_t flags` and lockless atomic
helpers (e.g., `symbol__type()`, `symbol__set_inlined()`).
- **Bi-endianness Support**: Added `*_endian` variants for `dso` and
`thread` helpers to ensure Capstone correctly disassembles cross-endian
binaries.
- **Architecture Hardening**:
- Fixed inverted SPARC logic in `perf_env__single_address_space()`.
- Prioritized DSO architecture over global environment in
`machine_or_dso_e_machine()`.
- Fixed an uninitialized memory leak in `perf_env__e_machine()`.
- Removed lossy `normalize_arch()` canonicalization in `process_arch()`.
- Review Feedback Status:
- **Addressed**: C11 atomics migration, bi-endianness, SPARC logic,
DSO prioritization, and uninitialized memory fixes.
- **Not Addressed / Dropped**:
- Patch 15 OS Release: The concern regarding the `uname()` fallback
during offline analysis was determined to be incorrect for these
uninitialized states; the original lazy assumption is retained.
- Patch 04/11: The `EM_AARCH64` fallbacks were dropped as the
definition should come from dwarf-regs.h when necessary.
v8:
- Address Sashiko AI review feedback for Patch 1:
- Switch all code dependent on the arch string to use `e_machine`
instead.
- Update `machine__is` and `machine__normalized_is` to take
`e_machine` integers instead of strings.
- Refactor `arch_syscalls__strerrno_function` to take an `e_machine`.
- Avoid premature caching of the host architecture in
`perf_session__e_machine`.
v7:
- Address better handling of strdup failures with arch in the
header/env.
- Address concurrent update issues in `struct symbol` bitfields by
introducing a global lock for writes.
v6: Ensure arch is canonical by going to e_machine and back (Sashiko)
v5: Add perf_env os_release helper (Namhyung/Sashiko)
v4: Fix Sashiko issues where an array element wasn't sorted properly,
the e_flags weren't returned properly, the idle type is changed to
a u8 rather than an enum value and the s390 version check for
psw_idle is slightly reordered and tweaked.
v3: Properly set up the e_machine coming from the perf_env as reported
by Honglei Wang.
v2: Some minor white space clean up.
v1: Initial release.
Ian Rogers (18):
perf env: Add perf_env__e_machine helper and use in perf_env__arch
perf tests topology: Switch env->arch use to env->e_machine
perf env, dso, thread: Add _endian variants for e_machine helpers
perf capstone: Determine architecture from e_machine
perf print_insn: Use e_machine for fallback IP length check
perf symbol: Avoid use of machine__is
perf machine: Use perf_env e_machine rather than arch
perf sample-raw: Use perf_env e_machine rather than arch
perf sort: Use perf_env e_machine rather than arch
perf arch common: Use perf_env e_machine rather than arch
perf header: In print_pmu_caps use perf_env e_machine
perf c2c: Use perf_env e_machine rather than arch
perf lock-contention: Use perf_env e_machine rather than arch
perf env: Refactor perf_env__arch_strerrno
perf env: Remove unused perf_env__raw_arch
perf env: Add helper to lazily compute the os_release
perf symbol: Add setters for bitfields sharing a byte to avoid
concurrent update issues
perf symbol: Lazily compute idle
tools/perf/arch/common.c | 92 +++--
tools/perf/builtin-c2c.c | 40 +-
tools/perf/builtin-inject.c | 2 +-
tools/perf/builtin-kwork.c | 2 +-
tools/perf/builtin-report.c | 2 +-
tools/perf/builtin-sched.c | 4 +-
tools/perf/builtin-top.c | 6 +-
tools/perf/builtin-trace.c | 7 +-
tools/perf/tests/symbols.c | 2 +-
tools/perf/tests/topology.c | 8 +-
tools/perf/tests/vmlinux-kallsyms.c | 2 +-
tools/perf/trace/beauty/Build | 1 +
tools/perf/trace/beauty/arch_errno_names.sh | 41 +-
tools/perf/ui/browsers/annotate.c | 2 +-
tools/perf/ui/browsers/map.c | 4 +-
tools/perf/util/annotate.c | 5 +-
tools/perf/util/auxtrace.c | 6 +-
tools/perf/util/callchain.c | 4 +-
tools/perf/util/capstone.c | 132 ++++---
tools/perf/util/data-convert-bt.c | 2 +-
tools/perf/util/data-convert-json.c | 6 +-
tools/perf/util/dlfilter.c | 2 +-
tools/perf/util/dso.c | 19 +-
tools/perf/util/dso.h | 14 +-
tools/perf/util/env.c | 370 ++++++++++++++----
tools/perf/util/env.h | 13 +-
tools/perf/util/evsel_fprintf.c | 6 +-
tools/perf/util/header.c | 55 ++-
tools/perf/util/intel-pt.c | 2 +-
tools/perf/util/libdw.c | 2 +-
tools/perf/util/lock-contention.c | 6 +-
tools/perf/util/machine.c | 37 +-
tools/perf/util/machine.h | 2 -
tools/perf/util/print_insn.c | 23 +-
tools/perf/util/print_insn.h | 2 +
tools/perf/util/probe-event.c | 4 +-
tools/perf/util/sample-raw.c | 21 +-
tools/perf/util/sample-raw.h | 6 +-
.../util/scripting-engines/trace-event-perl.c | 2 +-
.../scripting-engines/trace-event-python.c | 4 +-
tools/perf/util/session.c | 26 +-
tools/perf/util/sort.c | 66 ++--
tools/perf/util/srcline.c | 10 +-
tools/perf/util/symbol-elf.c | 5 +-
tools/perf/util/symbol.c | 235 ++++++++---
tools/perf/util/symbol.h | 80 +++-
tools/perf/util/symbol_fprintf.c | 4 +-
tools/perf/util/thread.c | 54 ++-
tools/perf/util/thread.h | 23 +-
49 files changed, 1048 insertions(+), 415 deletions(-)
--
2.54.0.823.g6e5bcc1fc9-goog