Re: [PATCH v12 00/19] perf: Use e_machine and lazily compute symbols

From: Namhyung Kim

Date: Wed Jun 03 2026 - 01:40:41 EST


Hi Ian,

On Tue, Jun 02, 2026 at 09:53:59AM -0700, Ian Rogers wrote:
> On Tue, Jun 2, 2026 at 8:25 AM Ian Rogers <irogers@xxxxxxxxxx> wrote:
> >
> > Add a helper to perf_env to compute the e_machine if it is EM_NONE.
> > Derive the value from the arch string if available. Similarly derive
> > the arch string from the ELF machine if available, for consistency.
> > This means perf's arch (machine type) is no longer determined by uname
> > but set to match that of the perf ELF executable.
> >
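For readers following along, the derivation described above can be sketched roughly as below. This is only an illustration under assumptions, not the code from the series: `arch_env_sketch`, `arch_map` and `arch_env_sketch__e_machine` are hypothetical names, the table is deliberately incomplete, and the EM_* values are defined locally (matching the ELF specification) so the sketch does not depend on the host's <elf.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* ELF machine numbers as defined by the ELF specification. They are
 * defined locally (guarded) so the sketch builds without <elf.h>. */
#ifndef EM_NONE
#define EM_NONE    0
#endif
#ifndef EM_386
#define EM_386     3
#endif
#ifndef EM_X86_64
#define EM_X86_64  62
#endif
#ifndef EM_AARCH64
#define EM_AARCH64 183
#endif
#ifndef EM_RISCV
#define EM_RISCV   243
#endif

/* Hypothetical stand-in for perf_env; the real struct has many more fields. */
struct arch_env_sketch {
	const char *arch;         /* arch string, may be NULL */
	unsigned short e_machine; /* EM_NONE until resolved */
};

/* Illustrative and incomplete mapping from arch strings to e_machine. */
static const struct {
	const char *arch;
	unsigned short e_machine;
} arch_map[] = {
	{ "x86_64",  EM_X86_64 },
	{ "i386",    EM_386 },
	{ "aarch64", EM_AARCH64 },
	{ "arm64",   EM_AARCH64 },
	{ "riscv64", EM_RISCV },
};

/* Return the cached e_machine, deriving it from the arch string when it
 * is still EM_NONE. Unknown or missing arch strings leave it as EM_NONE. */
static unsigned short arch_env_sketch__e_machine(struct arch_env_sketch *env)
{
	assert(env != NULL);

	if (env->e_machine != EM_NONE)
		return env->e_machine;

	if (env->arch) {
		for (size_t i = 0; i < sizeof(arch_map) / sizeof(arch_map[0]); i++) {
			if (!strcmp(env->arch, arch_map[i].arch)) {
				env->e_machine = arch_map[i].e_machine;
				break;
			}
		}
	}
	return env->e_machine;
}
```

Comparing the returned integer instead of calling strcmp on the arch string is what avoids the "arm64" versus "aarch64" style of naming differences mentioned above.
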
> > Migrate code away from strcmp on env->arch to using the e_machine
> > comparisons that are more accurate and not prone to uname and other
> > naming differences. While cleaning this up, also clean up the
> > capstone initialization code to cover more architectures and to set
> > the big endian flag based on ELF header information.
> >
> > Refactor perf_env__arch_strerrno to take an e_machine instead of an
> > architecture string, removing the HAVE_LIBTRACEEVENT dependency
> > entirely and making it unconditionally available. The generated errno
> > table includes fallback definitions for newer ELF machine constants to
> > ensure compatibility with older host glibc versions.
> >
> > Introduce a mutex in perf_env to safely protect lazy metadata setup,
> > such as os_release or e_machine resolution, preventing concurrent
> > initialization data races and memory leaks during multi-threaded
> > profiling or symbol loading. Properly initialize stack-allocated
> > perf_env instances to ensure safe mutex destruction.
> >
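As a rough illustration of the mutex-protected lazy setup described above, a minimal sketch could look like the following. Everything here is assumed for the example: `lazy_env_sketch`, its helpers and the placeholder release string are hypothetical, and the real perf_env code may be structured differently.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for perf_env with one lazily computed field. */
struct lazy_env_sketch {
	pthread_mutex_t lock;
	char *os_release; /* NULL until first requested */
};

/* Must be called for every instance, including stack-allocated ones,
 * so that the later pthread_mutex_destroy() is well defined. */
static void lazy_env_sketch__init(struct lazy_env_sketch *env)
{
	assert(env != NULL);
	pthread_mutex_init(&env->lock, NULL);
	env->os_release = NULL;
}

static void lazy_env_sketch__exit(struct lazy_env_sketch *env)
{
	assert(env != NULL);
	free(env->os_release);
	env->os_release = NULL;
	pthread_mutex_destroy(&env->lock);
}

/* Placeholder for the expensive lookup; the value is made up. */
static char *compute_os_release_sketch(void)
{
	static const char value[] = "0.0.0-sketch";
	char *copy = malloc(sizeof(value));

	if (copy)
		memcpy(copy, value, sizeof(value));
	return copy;
}

/* Compute os_release at most once. Holding the lock across the check
 * and the assignment prevents two threads from both allocating, which
 * would otherwise race and leak one of the copies. */
static const char *lazy_env_sketch__os_release(struct lazy_env_sketch *env)
{
	const char *ret;

	assert(env != NULL);
	pthread_mutex_lock(&env->lock);
	if (!env->os_release)
		env->os_release = compute_os_release_sketch();
	ret = env->os_release;
	pthread_mutex_unlock(&env->lock);
	return ret;
}
```

The second and later calls return the cached pointer without recomputing, which is the behavior the lazy helpers rely on.
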
> > Switch the idle computation to the point of use and lazily compute it,
> > rather than computing it for every symbol. The only current user is
> > `perf top`. At the point of use the perf_env is available and this can
> > be used to make sure the idle function computation correctly accounts
> > for architecture-specific and kernel-version-specific patterns.
> > To prevent concurrent updates to shared symbol bitfield flags, migrate
> > bitfield variables in struct symbol to C11 atomic flags.
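
The concern with bitfields sharing a byte is that setting one flag is a read-modify-write of the whole byte, so two threads updating different flags can lose an update. A minimal sketch of the atomic alternative, assuming hypothetical names (`symbol_sketch`, `SYM_SKETCH_IDLE`, `SYM_SKETCH_IGNORE`) rather than the actual struct symbol layout:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits; the real struct symbol has different fields. */
enum {
	SYM_SKETCH_IDLE   = 1u << 0,
	SYM_SKETCH_IGNORE = 1u << 1,
};

struct symbol_sketch {
	atomic_uchar flags; /* replaces several 1-bit bitfields */
};

/* Set or clear one flag with a single atomic read-modify-write, so a
 * concurrent update of a different bit in the same byte is not lost. */
static void symbol_sketch__set_flag(struct symbol_sketch *sym,
				    unsigned char flag, bool value)
{
	assert(sym != NULL);
	if (value)
		atomic_fetch_or(&sym->flags, flag);
	else
		atomic_fetch_and(&sym->flags, (unsigned char)~flag);
}

static bool symbol_sketch__test_flag(struct symbol_sketch *sym,
				     unsigned char flag)
{
	assert(sym != NULL);
	return (atomic_load(&sym->flags) & flag) != 0;
}
```

Routing all writes through a setter like this keeps callers from touching the byte directly.
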
>
> So I think this series is at the point where Sashiko [1] is giving
> warnings only for out-of-scope things and pre-existing conditions. I
> will give a detailed explanation below, but I'd appreciate help moving
> this forward with human review and submission. Thanks!
>
> > Ian Rogers (19):
> > perf env: Add perf_env__e_machine helper and use in perf_env__arch
>
> 1 critical and 2 high issues.
> The issues relate to existing data races, the inaccurate arch string,
> and normalizing the arch string stored in the data file. The existing
> data races don't bite us currently due to the single threaded nature
> of most of perf - multithreading is on the TODO list. The arch string
> is inaccurate and the e_machine in newer perf.data files resolves
> this. If we were using the arch string without the e_machine then the
> concerns over its use are valid, but this series is trying to remove
> the use of the arch string and strongly prefer the e_machine.
>
> > perf tests topology: Switch env->arch use to env->e_machine
>
> No regressions.
>
> > perf env, dso, thread: Add _endian variants for e_machine helpers
>
> 1 high issue for a potential pre-existing SEGV if a thread lacks maps.
> Let's hope that doesn't happen; the example given assumes a
> multithreaded environment and multi-threading is on the TODO list.
>
> > perf capstone: Determine architecture from e_machine
>
> 1 low issue. A flag only present in capstone 4.0 is used. As capstone
> 4.0 was released in 2018, let's just assume the flag is there rather
> than adding yet more complexity.
>
> > perf print_insn: Use e_machine for fallback IP length check
>
> No regressions.
>
> > perf symbol: Avoid use of machine__is
>
> 1 high issue. Concerns over pre-existing cross-platform analysis
> problems. Getting cross-platform analysis fully working is on the
> TODO list.
>
> > perf machine: Use perf_env e_machine rather than arch
> > perf sample-raw: Use perf_env e_machine rather than arch
> > perf sort: Use perf_env e_machine rather than arch
> > perf arch common: Use perf_env e_machine rather than arch
> > perf header: In print_pmu_caps use perf_env e_machine
> > perf c2c: Use perf_env e_machine rather than arch
> > perf lock-contention: Use perf_env e_machine rather than arch
> > perf env: Refactor perf_env__arch_strerrno
> > perf env: Remove unused perf_env__raw_arch
>
> No regressions x9.
>
> > perf env: Add mutex to protect lazy environment initialization
>
> 1 medium issue requesting more locking on more bits of perf_env.
> Multi-threading is on the TODO list and let's stop the feature creep
> here.
>
> > perf env: Add helper to lazily compute the os_release
>
> 1 high issue. Concern over a perf data issue in pipe mode. Addressing
> this would require a fairly major overhaul of perf data, so let's add
> fixing it to the TODO list.
>
> > perf symbol: Add setters for bitfields sharing a byte to avoid
> > concurrent update issues
> > perf symbol: Lazily compute idle
>
> No regressions x2.

Acked-by: Namhyung Kim <namhyung@xxxxxxxxxx>

Thanks,
Namhyung