perf's handling of unfindable user symbols...

From: David Miller
Date: Sun Oct 14 2018 - 03:42:45 EST



Perf has this hack where it uses the kernel symbol map as a backup
when a symbol can't be found in the user's symbol table(s).

This causes problems because the tests driving this code path use
machine__kernel_ip(), and that is completely meaningless on Sparc. On
sparc64 the kernel and user live in physically separate virtual
address spaces, rather than a shared one. And the kernel lives at a
virtual address that overlaps common userspace addresses. So this
test passes almost all the time when a user symbol lookup fails.

The consequence of this is that, if the unfound user virtual address
in the sample doesn't match up to a kernel symbol either, we trigger
things like this code in builtin-top.c:

	if (al.sym == NULL && al.map != NULL) {
		const char *msg = "Kernel samples will not be resolved.\n";
		/*
		 * As we do lazy loading of symtabs we only will know if the
		 * specified vmlinux file is invalid when we actually have a
		 * hit in kernel space and then try to load it. So if we get
		 * here and there are _no_ symbols in the DSO backing the
		 * kernel map, bail out.
		 *
		 * We may never get here, for instance, if we use -K/
		 * --hide-kernel-symbols, even if the user specifies an
		 * invalid --vmlinux ;-)
		 */
		if (!machine->kptr_restrict_warned && !top->vmlinux_warned &&
		    __map__is_kernel(al.map) && map__has_symbols(al.map)) {
			if (symbol_conf.vmlinux_name) {
				char serr[256];
				dso__strerror_load(al.map->dso, serr, sizeof(serr));
				ui__warning("The %s file can't be used: %s\n%s",
					    symbol_conf.vmlinux_name, serr, msg);
			} else {
				ui__warning("A vmlinux file was not found.\n%s",
					    msg);
			}

			if (use_browser <= 0)
				sleep(5);
			top->vmlinux_warned = true;
		}
	}

When I fire up a compilation on sparc, this triggers immediately.

I'm trying to figure out what the "backup to kernel map" code is
accomplishing.

I see some language in the current code and in the changes that have
happened in this area talking about vdso. Does that really happen?

The vdso is mapped into userspace virtual addresses, not kernel ones.

More history. This didn't cause problems on sparc some time ago,
because the kernel IP check used to be "ip < 0" :-) Sparc kernel
addresses are not negative. But now with machine__kernel_ip(), which
works using the symbol table determined kernel address range, it does
trigger.
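
To make the difference between the two checks concrete, here is a
minimal standalone sketch. It is not the perf code: the helper names
and both addresses are made up for illustration, and the range check
is assumed to be a plain "ip >= kernel start" comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Old-style check: view the address as signed, "negative" means kernel. */
static bool kernel_ip_by_sign(uint64_t ip)
{
	return (int64_t)ip < 0;
}

/* Range-style check: anything at or above the kernel start counts as kernel. */
static bool kernel_ip_by_range(uint64_t kernel_start, uint64_t ip)
{
	return ip >= kernel_start;
}

/* Illustrative only: a low kernel base and a user address above it. */
void low_kernel_base_example(void)
{
	const uint64_t kernel_start = 0x400000ULL;  /* made-up low kernel base */
	const uint64_t user_ip = 0x7000000ULL;      /* made-up user address */

	/* The sign check says "user" ... */
	assert(!kernel_ip_by_sign(user_ip));
	/* ... while the range check claims the same address is "kernel". */
	assert(kernel_ip_by_range(kernel_start, user_ip));
}
```

With a kernel that starts at a low virtual address, nearly every user
address sits above the kernel start, so the range check misclassifies
it, while the sign check never fires at all.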

What it all boils down to is that on architectures like sparc,
machine__kernel_ip() should always return false in this scenario, and
therefore this kind of logic:

	if (cpumode == PERF_RECORD_MISC_USER && machine &&
	    mg != &machine->kmaps &&
	    machine__kernel_ip(machine, al->addr)) {

is basically invalid. PERF_RECORD_MISC_USER implies no kernel address
can possibly match for the sample/event in question (no matter how
hard you try!) :-)

At the very least there should be an arch-specific way to disable this
logic, or better, some way to formalize the situation. Have some kind of
"user_kernel_shared_address_space" boolean, and then
machine__kernel_ip() can take a cpumode parameter and it can thus say:

	if (cpumode == PERF_RECORD_MISC_USER &&
	    !user_kernel_shared_address_space)
		return false;
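
Fleshed out a little, the proposal might look like the standalone
sketch below. Everything here is hypothetical: struct machine_sketch
and machine_sketch__kernel_ip() are stand-ins rather than perf's
struct machine and machine__kernel_ip(), the flag is stored per
machine only for convenience, and the fallback is assumed to be a
plain "ip >= kernel start" comparison. The PERF_RECORD_MISC_* values
mirror the perf_event.h uapi header and are redefined so the sketch
builds on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same values as the perf_event.h uapi header, redefined for a standalone build. */
#define PERF_RECORD_MISC_KERNEL	1
#define PERF_RECORD_MISC_USER	2

/* Hypothetical, stripped-down stand-in for perf's struct machine. */
struct machine_sketch {
	uint64_t kernel_start;
	/* False on architectures where user and kernel use separate address spaces. */
	bool user_kernel_shared_address_space;
};

static bool machine_sketch__kernel_ip(const struct machine_sketch *machine,
				      uint8_t cpumode, uint64_t ip)
{
	/* A user-mode sample can never be a kernel address here. */
	if (cpumode == PERF_RECORD_MISC_USER &&
	    !machine->user_kernel_shared_address_space)
		return false;

	/* Otherwise fall back to the address range check. */
	return ip >= machine->kernel_start;
}

/* Illustrative values only: a low kernel base and separate address spaces. */
void separate_address_space_example(void)
{
	const struct machine_sketch m = {
		.kernel_start = 0x400000ULL,
		.user_kernel_shared_address_space = false,
	};

	/* User sample above the kernel base: no longer treated as kernel. */
	assert(!machine_sketch__kernel_ip(&m, PERF_RECORD_MISC_USER, 0x7000000ULL));
	/* Kernel sample at the same address: still resolved against the kernel. */
	assert(machine_sketch__kernel_ip(&m, PERF_RECORD_MISC_KERNEL, 0x7000000ULL));
}
```

Architectures with a shared address space would set the flag to true
and keep today's behaviour, since the early return is then skipped.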

Comments?