[PATCH v17 0/5] perf tools: Add inject --aslr feature, early maps loading, and decoupling fixes
From: Ian Rogers
Date: Sun Jun 07 2026 - 02:09:51 EST
This patch series introduces the new 'perf inject --aslr' feature to
remap virtual memory addresses or drop physical memory event leaks
when profile record data is shared between machines. Bundled with this
feature is a bug fix inside the core map tracking tool that hardens
perf session analysis against concurrent lookup data races.
Detailed Mechanism of MMAP Mapping and ASLR Virtual Address Allocation:
The ASLR tool virtualizes the address space of the recorded processes by
intercepting MMAP and MMAP2 events to build a consistent translation
database, which is subsequently used to rewrite sample addresses.
It maintains two primary lookup databases using hash maps:
1. 'remap_addresses': Maps an original mapping key to its new remapped
base address. The key uses topological invariant coordinates:
(machine, dso, invariant). The invariant is computed as (start - pgoff)
for DSO-backed mappings. This invariant remains constant even when
perf's internal overlap-resolution splits a VMA into fragmented
pieces, ensuring split maps resolve consistently back to the same
remapped base.
2. 'top_addresses': Tracks the allocation state per process (machine, pid).
It maintains 'remapped_max' (the highest allocated address in the
virtualized space).
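
As an illustration of why (start - pgoff) is stable across splits, here is
a minimal sketch. The struct and function names are hypothetical stand-ins,
not the definitions in aslr.c, and pgoff is assumed to be a byte offset as
it is in perf MMAP events.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical, simplified key for illustration only. The cover letter
 * names the coordinates (machine, dso, invariant) but not the layout.
 */
struct remap_key {
	const void *machine;	/* opaque machine identity */
	const void *dso;	/* opaque DSO identity */
	uint64_t invariant;	/* start - pgoff for DSO-backed mappings */
};

/* Unsigned subtraction wraps in a well-defined way in C. */
static uint64_t mapping_invariant(uint64_t start, uint64_t pgoff)
{
	return start - pgoff;
}

/* Two keys match only if all three coordinates match. */
static bool remap_key_equal(const struct remap_key *a,
			    const struct remap_key *b)
{
	return a->machine == b->machine && a->dso == b->dso &&
	       a->invariant == b->invariant;
}
```

If a VMA starting at 0x7f0000000000 with pgoff 0 is split so that a
fragment starts at 0x7f0000002000 with pgoff 0x2000, both pieces yield the
same invariant and therefore the same key.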
For each MMAP/MMAP2 event:
- We look up the DSO and invariant key in 'remap_addresses'. If found, we
reuse the translation, preserving the offset within the mapping.
- If not found, we allocate a new remapped address space:
- We use thread__find_map to look up the mapping immediately preceding
the new one in the original address space (at start - 1). If the
preceding mapping was also remapped, we place the new mapping
contiguously after it in the remapped space. This preserves
contiguity of split mappings (e.g., symbols split by HugeTLB,
or anonymous .bss segments adjacent to initialized data).
- If no contiguous mapping is found, we insert a 1-page gap from
the highest allocated address (remapped_max) to prevent accidental
merging of unrelated VMAs.
- The event's start address (and pgoff for kernel maps) is rewritten,
and the event is delegated to the output writer.
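
The allocation choice in the steps above can be sketched as follows. This
is a simplified model, not the code in aslr.c: the names, the 4 KiB page
size, and the assumption that remapped_max is page aligned are all
illustrative, and overflow handling is omitted.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL	/* assumed page size */

/* Hypothetical per-process allocation state (see 'top_addresses'). */
struct top_state {
	uint64_t remapped_max;	/* highest allocated remapped address */
};

/*
 * Pick the remapped base for a new mapping of 'len' bytes.
 * has_prev/prev_remapped_end describe the already remapped mapping that
 * covers (start - 1) in the original address space, if there is one.
 */
static uint64_t alloc_remapped_base(struct top_state *top, bool has_prev,
				    uint64_t prev_remapped_end, uint64_t len)
{
	uint64_t base;

	if (has_prev)
		base = prev_remapped_end;	/* stay contiguous */
	else
		base = top->remapped_max + SKETCH_PAGE_SIZE; /* 1-page gap */

	if (base + len > top->remapped_max)
		top->remapped_max = base + len;
	return base;
}
```

With remapped_max at 0x10000, an unrelated 0x2000-byte mapping lands at
0x11000, and a mapping contiguous with it then lands at 0x13000 with no
gap.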
To remain strictly conservative and guarantee security, the tool
scrubs breakpoint addresses (bp_addr) from all synthesized stream
headers, completely drops PERF_RECORD_TEXT_POKE events to prevent
leaks of absolute immediate pointer operands, and drops unsupported
complex payloads (such as user register stacks, raw tracepoints, and
hardware AUX tracing frames).
Verification is reinforced with a shell test ('inject_aslr.sh').
Prerequisite Bug Fix (Patch 1): During development, a core map
indexing issue was identified and resolved to prevent concurrent
lookup data races during session analysis.
Changes since v16:
- Patch 2: Refactored inline ASLR stripping logic out of builtin-inject.c
and into dedicated helpers (aslr_tool__strip_attr_event and
aslr_tool__strip_evlist) in aslr.c to better separate concerns.
- Patch 2: Fixed guest machine allocation memory leak in
aslr_tool__delete() where machines__exit() explicitly skipped freeing
the guest processes tree.
- Patch 3: Fixed bounds-check violations during cross-endian parsing inside
aslr_tool__process_sample() by correctly applying bswap_64() to raw
offsets, iteration counts, sizes, and addresses prior to logical
evaluation when orig_needs_swap is active.
- Patch 4: Fixed pipe mode parser misalignment bug by safely fetching
needs_swap from the initialized evsel rather than blindly intercepting
HEADER_ATTR events prior to session parsing.
- Patch 4: Resolved checkpatch.pl line length warnings in the bswap_64
endianness swapping logic.
- Patch Series: Reordered the final two patches. "perf aslr: Strip
sample registers" is now Patch 4, and "perf test: Add inject ASLR
test" is now Patch 5. This ensures the register stripping logic
is fully introduced before the comprehensive shell tests validate it,
preventing bisectability test failures and easing merge conflicts.
- Patch 5: Fixed "User registers stripping test" starvation when run as
root by explicitly using '-e cycles:u' during recording, preventing
the ring buffer from overflowing with kernel samples.
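
To illustrate the Patch 3 item above (swap raw values before evaluating
them), here is a minimal sketch of a count bounds check on a cross-endian
payload. The helper names and payload layout are hypothetical and are not
taken from aslr_tool__process_sample(); only bswap_64() is the real
<byteswap.h> macro.

```c
#include <byteswap.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a raw u64 stored in file byte order. */
static uint64_t read_u64(const uint64_t *raw, bool needs_swap)
{
	return needs_swap ? bswap_64(*raw) : *raw;
}

/*
 * The payload is assumed to be one u64 count followed by that many u64
 * entries. Returns true only if the entries fit in 'size' bytes. The
 * count is swapped to host order before it is compared.
 */
static bool count_fits(const uint64_t *payload, size_t size, bool needs_swap)
{
	uint64_t nr;

	if (size < sizeof(uint64_t))
		return false;
	nr = read_u64(payload, needs_swap);
	return nr <= (size - sizeof(uint64_t)) / sizeof(uint64_t);
}
```

Comparing the unswapped count instead would evaluate a byte-reversed
value, so a valid count of 2 would be read as 0x0200000000000000 and
rejected, while other values could pass the check incorrectly.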
Changes since v15:
- Patch 2: Added bounds checking for event->header.size before writing
to breakpoint fields to avoid heap buffer overflow on older ABI events.
- Patch 2: Fixed asymmetric calculation bug in aslr_tool__findnew_mapping()
where pgoff for anonymous kernel memory was not properly subtracted upon
insertion, causing the lookup addition to overflow.
- Patch 2: Added detailed comments documenting the symmetric lookup and
insertion math for unmapped and mapped memory blocks.
- Patch 5: Add missing kprobe and uprobe scrubbing of config1 and
config2 during aslr_tool__strip_evlist() to strictly conform with
repipe constraints.
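
The size check before scrubbing a field, as in the first v15 item above,
can be sketched like this. The struct is a hypothetical cut-down stand-in
for an attribute record, not struct perf_event_attr, and the helper is
not the one in aslr.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, shortened attribute layout for illustration only. */
struct sketch_attr {
	uint32_t type;
	uint32_t size;
	uint64_t config;
	uint64_t bp_addr;
};

/*
 * Zero a u64 field at 'field_offset' only if it lies entirely inside
 * the 'buf_size' bytes that were actually recorded. Older, shorter
 * records are left untouched instead of being written past their end.
 */
static bool scrub_u64_field(void *buf, size_t buf_size, size_t field_offset)
{
	if (field_offset > buf_size ||
	    buf_size - field_offset < sizeof(uint64_t))
		return false;
	memset((char *)buf + field_offset, 0, sizeof(uint64_t));
	return true;
}
```

A record whose recorded size ends before bp_addr is skipped rather than
written, which is the overflow case the bounds check guards against.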
Changes since v14:
- Patch 2: Removed unnecessary vertical whitespace in builtin-inject.c.
- Patch 2: Added comments explaining why pgoff is assigned for
anonymous memory maps to prevent ASLR leaks.
- Patch 2: Removed orig_last_end tracking and refactored contiguous mapping
detection to use thread__find_map(..., start - 1, ...) based on Gabriel's
feedback.
- Patch 2: Scrub kprobe/uprobe event config1 and config2 fields to prevent
address leaks.
- Patch 2: Overwrite pgoff with the remapped start address for anonymous
mappings (detected via is_anon_memory and is_no_dso_memory).
- Patch 3: Fix C90 mixed declaration error for orig_needs_swap.
- Patch 3: Temporarily disable evsel->needs_swap during the secondary
evsel__parse_sample() call to prevent branch stack double-swapping bugs.
Changes since v13:
- Patch 2: Added a NULL check for env before calling
perf_env__kernel_is_64_bit(env) to prevent potential segfaults if the
recorded environment has no headers.
- Patch 5: Fixed sample_size and id_pos going out of sync during
aslr_tool__strip_evlist() and aslr_tool__restore_evlist(). Instead of
using evsel__reset_sample_bit(), which was acting as a no-op due to
early bit clearing and corrupted sample_size, the tool now directly
updates sample_type and recomputes sample_size/id_pos dynamically.
Added orig_sample_size to aslr_evsel_priv to correctly restore the
state.
Changes since v12:
- Patch 2: Fixed potential NULL pointer dereference in
remap_addresses__hash() when handling unmapped memory events (key->dso
is NULL) under REFCNT_CHECKING.
- Patch 2: Dynamically detect machine architecture bitness via
perf_env__kernel_is_64_bit() to select appropriate kernel_space_start
boundaries, avoiding 64-bit address injection on 32-bit platforms.
Changes since v11:
- Patch 1: Fixed struct dso name accessor in maps.c by using
dso__name() instead of ->name.
- Patch 2: Fixed hash function in aslr.c to hash the underlying
dso pointer using RC_CHK_ACCESS to support reference count checking.
Changes since v10:
- Patch 1: Added explicit tracking array logic in maps__load_maps()
to correctly accumulate valid maps (skipping NULL entries after
failures) and safely return the exact populated count, resolving
out-of-bounds pointer iteration panics.
- Patch 3: Fixed endianness bug during cross-endian sample parsing
by passing evsel->needs_swap instead of false to __evsel__parse_sample
in aslr.c, ensuring correct 32-bit field byte unswapping for packed
fields. Refactored evsel__parse_sample to take a needs_swap argument
via __evsel__parse_sample.
- Patch 4: Fixed inject_aslr.sh exit code handling in trap functions
to capture and propagate the correct pipeline failure status code
instead of unconditionally returning success or failing the test.
Changes since v9:
- Patch 1: Added `-ENOMEM` error check inside
`maps__find_symbol_by_name()` and return `NULL` early. Added map
sorting state invalidation on early return in `maps__load_maps()`.
- Patch 2: Fixed encapsulation by using `thread__maps()` and
`thread__pid()` accessors in `aslr_tool__findnew_mapping()`. Added
`pr_warning_once` warning when raw auxtrace data is dropped.
- Patch 3: Fixed encapsulation by using `thread__maps()` and
`thread__pid()` accessors in `aslr_tool__remap_address()`. Wrapped
`evsel__parse_sample()` to temporarily disable `needs_swap` to avoid
branch stack endianness corruption on cross-endian files. Fixed ISO
C90 warning for declaration-after-statement for `orig_needs_swap`.
- Patch 4: Fixed duplicate cleanup by explicitly removing trap
handlers (`trap - EXIT TERM INT`) inside the `cleanup()` function.
- Patch 5: Fixed heap corruption by adding size bounds checking before
writing to `sample_regs_user` and `sample_regs_intr` fields. Added
missing register mask clearing logic for the `itrace` synthesis path
of `perf_event__repipe_attr()`.
Ian Rogers (5):
perf maps: Add maps__mutate_mapping
perf inject/aslr: Add ASLR tool infrastructure and MMAP tracking
perf inject/aslr: Implement sample address remapping
perf aslr: Strip sample registers
perf test: Add inject ASLR test
tools/perf/builtin-inject.c | 50 +-
tools/perf/tests/shell/inject_aslr.sh | 519 +++++++++
tools/perf/util/Build | 1 +
tools/perf/util/aslr.c | 1394 +++++++++++++++++++++++++
tools/perf/util/aslr.h | 44 +
tools/perf/util/evsel.c | 6 +-
tools/perf/util/evsel.h | 10 +-
tools/perf/util/machine.c | 32 +-
tools/perf/util/maps.c | 149 ++-
tools/perf/util/maps.h | 3 +
tools/perf/util/symbol-elf.c | 41 +-
tools/perf/util/symbol.c | 17 +-
12 files changed, 2202 insertions(+), 64 deletions(-)
create mode 100755 tools/perf/tests/shell/inject_aslr.sh
create mode 100644 tools/perf/util/aslr.c
create mode 100644 tools/perf/util/aslr.h
--
2.54.0.1032.g2f8565e1d1-goog