Re: [PATCH v3] perf: Add is_mapping_symbol() helper for kernel mapping symbol filtering

From: Ian Rogers

Date: Thu May 07 2026 - 11:24:11 EST


On Thu, May 7, 2026 at 12:11 AM Rui Qi <qirui.001@xxxxxxxxxxxxx> wrote:
>
> The perf tool currently has ad-hoc logic to filter out ELF mapping
> symbols scattered across multiple files. ARM, AArch64 and RISC-V each
> have their own inline checks in dso__load_sym_internal(), and kallsym
> processing has yet another check for ARM module symbols.
>
> This is fragile: adding support for a new architecture or adjusting
> which prefixes are considered mapping symbols requires touching
> multiple places, and it is easy for the checks to diverge. It also
> does not match the kernel's own is_mapping_symbol() logic, which
> additionally covers x86 local symbols (".L*" and "L0*").
>
> Introduce a single is_mapping_symbol() inline helper in symbol.h and
> convert all kernel symbol handling to use it. The helper covers the
> existing "$" prefix used by ARM, AArch64 and RISC-V, and also adds
> the x86 local symbol prefixes so that perf stays consistent with
> the kernel.
>
> Signed-off-by: Rui Qi <qirui.001@xxxxxxxxxxxxx>
> ---
> Changes in v3:
> - Add is_mapping_symbol() check for kernel modules in dso__load_sym_internal()
> - Add is_mapping_symbol() check in machine__process_ksymbol_unregister()
>
> Link (v2): https://lore.kernel.org/all/20260506073820.2419087-1-qirui.001@xxxxxxxxxxxxx/
>
> Changes in v2:
> - Only apply is_mapping_symbol() filtering to kernel symbols (kallsyms
> and ksymbol events), not to user-space symbols from ELF files,
> BFD libraries, or perf map files. This avoids incorrectly
> discarding valid user-space function names that start with '$',
> which is a legal character in identifiers for many languages
> (e.g., Java, Scala) and compilers (GCC).
> - Move the mapping symbol check in machine__process_ksymbol_register()
> to the beginning of the function, before any map/dso allocation
> or insertion, to avoid leaving empty maps in the kernel map tree.
>
> Link (v1): https://lore.kernel.org/all/20260504090609.1801880-1-qirui.001@xxxxxxxxxxxxx/
> ---
> tools/perf/util/machine.c | 12 +++++++++++-
> tools/perf/util/symbol-elf.c | 8 ++++++++
> tools/perf/util/symbol.c | 4 ++--
> tools/perf/util/symbol.h | 15 +++++++++++++++
> 4 files changed, 36 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> index e76f8c86e62a..4e33ba06111d 100644
> --- a/tools/perf/util/machine.c
> +++ b/tools/perf/util/machine.c
> @@ -729,9 +729,15 @@ static int machine__process_ksymbol_register(struct machine *machine,
> {
> struct symbol *sym;
> struct dso *dso = NULL;
> - struct map *map = maps__find(machine__kernel_maps(machine), event->ksymbol.addr);
> + struct map *map;
> int err = 0;
>
> + /* Ignore mapping symbols in ksymbol events - check early before any state mutation */
> + if (is_mapping_symbol(event->ksymbol.name))
> + return 0;
> +
> + map = maps__find(machine__kernel_maps(machine), event->ksymbol.addr);
> +
> if (!map) {
> dso = dso__new(event->ksymbol.name);
>
> @@ -790,6 +796,10 @@ static int machine__process_ksymbol_unregister(struct machine *machine,
> struct symbol *sym;
> struct map *map;
>
> + /* Ignore mapping symbols in ksymbol events */
> + if (is_mapping_symbol(event->ksymbol.name))
> + return 0;
> +
> map = maps__find(machine__kernel_maps(machine), event->ksymbol.addr);
> if (!map)
> return 0;
> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
> index 7afa8a117139..6b12508ea58d 100644
> --- a/tools/perf/util/symbol-elf.c
> +++ b/tools/perf/util/symbol-elf.c
> @@ -1607,6 +1607,14 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
> continue;
> }
>
> + /*
> + * For kernel modules, also reject x86 local symbols (.L* and L0*)
> + * to match the kernel's is_mapping_symbol() logic and kallsyms
> + * parsing behavior.
> + */
> + if (kmodule && is_mapping_symbol(elf_name))
> + continue;
> +
> if (runtime_ss->opdsec && sym.st_shndx == runtime_ss->opdidx) {
> u32 offset = sym.st_value - syms_ss->opdshdr.sh_addr;
> u64 *opd = opddata->d_buf + offset;
> diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
> index fcaeeddbbb6b..af03b16c17c6 100644
> --- a/tools/perf/util/symbol.c
> +++ b/tools/perf/util/symbol.c
> @@ -770,8 +770,8 @@ static int map__process_kallsym_symbol(void *arg, const char *name,
> if (!symbol_type__filter(type))
> return 0;
>
> - /* Ignore local symbols for ARM modules */
> - if (name[0] == '$')
> + /* Ignore mapping symbols in kallsyms */
> + if (is_mapping_symbol(name))
> return 0;
>
> /*
> diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
> index bd6eb90c8668..27fa1b43e6f1 100644
> --- a/tools/perf/util/symbol.h
> +++ b/tools/perf/util/symbol.h
> @@ -28,6 +28,21 @@ struct maps;
> struct option;
> struct build_id;
>
> +/*
> + * Ignore kernel mapping symbols, matching kernel is_mapping_symbol() logic.
> + * This checks for '$' prefix (used by ARM, AArch64, RISC-V) and
> + * x86 local symbol prefixes (.L* and L0*).
> + * Only use this for kernel symbols (kallsyms, ksymbol events).
> + */
> +static inline bool is_mapping_symbol(const char *str)

Is there a good reference for what is meant by "mapping symbol"?
Would "local symbol" be more appropriate for x86? On ARM the term
seems to be well defined:
https://developer.arm.com/documentation/dui0803/a/Accessing-and-managing-symbols-with-armlink/About-mapping-symbols
I'm wondering if we can make this more intention-revealing, and also
whether the check should depend on the e_machine, so perhaps:
```
static inline bool is_ignored_kernel_symbol(const char *name, uint16_t e_machine)
{
	if (e_machine == EM_386 || e_machine == EM_X86_64) {
		/* Local symbols on x86 may start with .L or L0. */
		return (name[0] == '.' && name[1] == 'L') ||
		       (name[0] == 'L' && name[1] == '0');
	}
	/*
	 * All other machine types. Assume symbols starting with $ are
	 * mapping symbols used to denote transitions between different
	 * sections of data and code.
	 */
	return name[0] == '$';
}
```
I think you can use something like `dso__e_machine` at all call sites
to get the ELF machine type for the binary or kernel, but maybe this
is overkill.
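
To make the idea easy to try outside the perf tree, here is a minimal
standalone sketch of the same check. The function name is only the
hypothetical one proposed above, not an existing perf helper, and the
EM_* fallbacks are assumptions added so the snippet builds without
<elf.h>; the values are the ones assigned by the ELF specification.
```c
#include <stdbool.h>
#include <stdint.h>

/* Fallback e_machine values, only used when <elf.h> is not included. */
#ifndef EM_386
#define EM_386 3
#endif
#ifndef EM_X86_64
#define EM_X86_64 62
#endif
#ifndef EM_AARCH64
#define EM_AARCH64 183
#endif

/*
 * Hypothetical helper: returns true when a kernel symbol name should be
 * skipped for the given ELF machine type.
 */
static inline bool is_ignored_kernel_symbol(const char *name, uint16_t e_machine)
{
	if (e_machine == EM_386 || e_machine == EM_X86_64) {
		/* x86: local symbols may start with .L or L0. */
		return (name[0] == '.' && name[1] == 'L') ||
		       (name[0] == 'L' && name[1] == '0');
	}
	/* Everything else: treat a leading $ as a mapping symbol. */
	return name[0] == '$';
}
```
With this shape, "$x" is ignored for EM_AARCH64 but kept for
EM_X86_64, and ".Lfoo" is ignored only on x86. Short-circuit
evaluation keeps the name[1] reads safe for empty and one-character
names.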

Thanks,
Ian

> +{
> + if (str[0] == '.' && str[1] == 'L')
> + return true;
> + if (str[0] == 'L' && str[1] == '0')
> + return true;
> + return str[0] == '$';
> +}
> +
> /*
> * libelf 0.8.x and earlier do not support ELF_C_READ_MMAP;
> * for newer versions we can use mmap to reduce memory usage:
> --
> 2.20.1