Re: [BUG] perf top reports not being able to resolve kernel symbols

From: Namhyung Kim
Date: Thu Jan 02 2025 - 15:59:05 EST


Hi Arnaldo,

On Thu, Jan 02, 2025 at 04:51:06PM -0300, Arnaldo Carvalho de Melo wrote:
> On Thu, Jan 02, 2025 at 04:25:07PM -0300, Arnaldo Carvalho de Melo wrote:
> > root@number:~# readelf -sw /lib/modules/6.13.0-rc2/build/vmlinux | grep -B5 -A5 ' 0000000001600'
> > 259227: ffffffff8156e290 262 FUNC GLOBAL DEFAULT 1 zs_free
> > 259228: ffffffff8183a4d0 269 FUNC GLOBAL DEFAULT 1 security_inode_g[...]
> > 259229: ffffffff81c8d900 191 FUNC GLOBAL DEFAULT 1 devres_find
> > 259230: ffffffff812e11c0 16 FUNC GLOBAL DEFAULT 1 __pfx___probestu[...]
> > 259231: ffffffff81c985a0 16 FUNC GLOBAL DEFAULT 1 __pfx_pm_qos_sys[...]
> > 259232: 0000000001600000 0 NOTYPE GLOBAL DEFAULT ABS text_size
> > 259233: ffffffff81487f10 117 FUNC GLOBAL DEFAULT 1 shmem_read_folio_gfp
> > 259234: ffffffff81e08540 155 FUNC GLOBAL DEFAULT 1 __traceiter_smbu[...]
> > 259235: ffffffff811e13a0 16 FUNC GLOBAL DEFAULT 1 __pfx_thaw_workqueues
> > 259236: ffffffff81b04c70 599 FUNC GLOBAL DEFAULT 1 acpi_install_method
> > 259237: ffffffff81de7d40 16 FUNC GLOBAL DEFAULT 1 __pfx_psmouse_se[...]
> > root@number:~#
>
> > There it is, that "text_size" symbol stayed with a prev->end equal
> > to prev->start and thus 0x00000000016001c1 stops being resolved, which
> > leads us to get to that buggy warning.
>
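
To illustrate the point above, here is a minimal sketch (not perf's actual
code) of why an entry whose end equals its start can never match an address,
and how extending zero-length entries to the next symbol's start changes
that. The "text_size" start and the 0x16001c1 address are from the output
above; the following symbol and its address are hypothetical, and this says
nothing about what the right fix in perf is.

```python
# Symbols as (start, end, name); end is exclusive.
SYMBOLS = [
    (0x1600000, 0x1600000, "text_size"),          # zero length, from readelf above
    (0x1700000, 0x1700010, "hypothetical_next"),  # assumption, for illustration only
]

def find(symbols, addr):
    """Return the name of the symbol covering addr, or None."""
    for start, end, name in symbols:
        if start <= addr < end:
            return name
    return None

def fixup_ends(symbols):
    """Extend zero-length symbols up to the start of the next symbol."""
    ordered = sorted(symbols)
    fixed = []
    for i, (start, end, name) in enumerate(ordered):
        if end == start and i + 1 < len(ordered):
            end = ordered[i + 1][0]
        fixed.append((start, end, name))
    return fixed

before = find(SYMBOLS, 0x16001c1)             # no match: end == start
after = find(fixup_ends(SYMBOLS), 0x16001c1)  # now covered by "text_size"
```
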
> > I'll put all this into a patch and send it for review,
>
> But looking further, where do those 0x00000000016001c1 addresses come
> from?
>
> (gdb) p /x sample->ip
> $10 = 0xffffffffb7401fad
> (gdb) p /x al->addr
> $11 = 0x1601fad
> (gdb) bt
> #0 perf_event__process_sample (tool=0x7fffffff9bd0, event=0x1017400, evsel=0xf68860, sample=0x7fff8dffa470, machine=0xf8e818) at builtin-top.c:813
> #1 0x0000000000447c5c in deliver_event (qe=0x7fffffff9ee8, qevent=0x1024670) at builtin-top.c:1213
> #2 0x0000000000642706 in do_flush (oe=0x7fffffff9ee8, show_progress=false) at util/ordered-events.c:245
> #3 0x0000000000642a5d in __ordered_events__flush (oe=0x7fffffff9ee8, how=OE_FLUSH__TOP, timestamp=0) at util/ordered-events.c:324
> #4 0x0000000000642b47 in ordered_events__flush (oe=0x7fffffff9ee8, how=OE_FLUSH__TOP) at util/ordered-events.c:342
> #5 0x00000000004477e9 in process_thread (arg=0x7fffffff9bd0) at builtin-top.c:1125
> #6 0x00007ffff6ea5d97 in start_thread () from /lib64/libc.so.6
> #7 0x00007ffff6f29c8c in clone3 () from /lib64/libc.so.6
> (gdb)
>
> root@number:~# grep ffffffffb7401f /proc/kallsyms
> ffffffffb7401f09 t repeat_nmi
> ffffffffb7401f2e t end_repeat_nmi
> ffffffffb7401f81 t nmi_no_fsgsbase
> ffffffffb7401f85 t nmi_swapgs
> ffffffffb7401f88 t nmi_restore
> ffffffffb7401fb0 T entry_SYSCALL32_ignore
> ffffffffb7401fd0 T __pfx_clear_bhb_loop
> ffffffffb7401fe0 T clear_bhb_loop
> root@number:~#
>
> Looks like nmi_restore...
>
> Which is...
>
> 780: ffffffff82401ee8 0 NOTYPE LOCAL DEFAULT 1 nested_nmi_out
> 781: ffffffff82401ed0 0 NOTYPE LOCAL DEFAULT 1 nested_nmi
> 782: ffffffff82401eeb 0 NOTYPE LOCAL DEFAULT 1 first_nmi
> 783: ffffffff82401f81 0 NOTYPE LOCAL DEFAULT 1 nmi_no_fsgsbase
> 784: ffffffff82401f88 0 NOTYPE LOCAL DEFAULT 1 nmi_restore
> 785: ffffffff82401f85 0 NOTYPE LOCAL DEFAULT 1 nmi_swapgs
> 786: 0000000000000000 0 FILE LOCAL DEFAULT ABS syscall_64.c
> 787: 0000000000000000 0 FILE LOCAL DEFAULT ABS common.c
> 788: ffffffff810cc2b0 16 FUNC LOCAL DEFAULT 1 ia32_emulation_o[...]
> 789: ffffffff821e57f0 241 FUNC LOCAL DEFAULT 1 __do_fast_syscall_32
>
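
The lookup above can be reproduced with a small sketch, assuming a plain
nearest-preceding-symbol search over the kallsyms lines quoted earlier (the
function name resolve() is mine, not something from perf). It also derives
the offset between the running kernel and vmlinux from the two nmi_restore
addresses shown.

```python
from bisect import bisect_right

# (address, name) pairs copied from the /proc/kallsyms output above.
KALLSYMS = [
    (0xffffffffb7401f09, "repeat_nmi"),
    (0xffffffffb7401f2e, "end_repeat_nmi"),
    (0xffffffffb7401f81, "nmi_no_fsgsbase"),
    (0xffffffffb7401f85, "nmi_swapgs"),
    (0xffffffffb7401f88, "nmi_restore"),
    (0xffffffffb7401fb0, "entry_SYSCALL32_ignore"),
    (0xffffffffb7401fd0, "__pfx_clear_bhb_loop"),
    (0xffffffffb7401fe0, "clear_bhb_loop"),
]

def resolve(ip, table):
    """Return (name, offset) of the last symbol starting at or before ip."""
    starts = [addr for addr, _ in table]
    i = bisect_right(starts, ip) - 1
    if i < 0:
        return None
    addr, name = table[i]
    return name, ip - addr

# sample->ip from the gdb session above.
hit = resolve(0xffffffffb7401fad, KALLSYMS)

# Difference between the runtime and vmlinux addresses of nmi_restore.
slide = 0xffffffffb7401f88 - 0xffffffff82401f88
```

With the values above this gives nmi_restore+0x25, and the same sample in
vmlinux terms would be ip - slide.
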
> So there are symbols that are not being resolved anymore that were
> before your patch, namely:
>
> arch/x86/entry/entry_64.S
>
> nmi_no_fsgsbase:
> /* EBX == 0 -> invoke SWAPGS */
> testl %ebx, %ebx
> jnz nmi_restore
>
> nmi_swapgs:
> swapgs
>
> nmi_restore:
> POP_REGS
>

Sorry about that, maybe I should've done this instead. Can you check
if it works correctly?

Thanks,
Namhyung

---8<---