perf handling of unknown symbols...

From: David Miller
Date: Sat Oct 20 2018 - 04:32:58 EST



I was able to, I think, track down why I get unknown symbols with perf
top.

It happens for processes that start up while perf is running.

It's due to races if a process changes its address space really
quickly while perf is processing all of the events.

If we are probing procfs and/or sysfs or something like that to gather
information about a process as it was before it exec'd or fork'd, but
the probe only runs after it has done so, the dso lookups and stuff
are going to fail.

So we end up with an unresolvable dso, and an al.sym which is NULL and
then we hit this check in perf_event__process_sample() of
builtin-top.c:

	if (al.sym == NULL || !al.sym->idle) {

Before 2011 and commit:

commit ab81f3fd350c510730adb1ca40ef55c2b2952121
Author: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
Date:   Wed Oct 5 19:16:15 2011 -0300

    perf top: Reuse the 'report' hist_entry/hists classes

the test was just:

	if (!al.sym->ignore) {

and the commit message even mentions:

    This actually fixes several problems we had in the old 'perf top':

    1. Unresolved symbols not show, limitation that came from the old
       "KernelTop" codebase, to solve it we would need to do changes
       that would make sym_entry have most of the hist_entry fields.
    ...
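
To make the difference between the two tests concrete, here is a
minimal sketch, not the actual builtin-top.c code. struct toy_symbol,
counted_now() and counted_before() are made-up names for illustration.
The NULL guard in counted_before() is an assumption that models the
"unresolved symbols not shown" behaviour the commit message describes;
the quoted old test itself had no such guard.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for perf's symbol; only the two flags discussed. */
struct toy_symbol {
	bool idle;	/* consulted by the current test */
	bool ignore;	/* consulted by the pre-2011 test */
};

/* Current test: a sample with no symbol (sym == NULL) is still counted. */
bool counted_now(const struct toy_symbol *sym)
{
	return sym == NULL || !sym->idle;
}

/*
 * Pre-2011 behaviour as described by the commit message: unresolved
 * symbols were not shown. The NULL guard is an assumption added so the
 * sketch is safe to call; the quoted test was just !sym->ignore.
 */
bool counted_before(const struct toy_symbol *sym)
{
	if (sym == NULL)
		return false;
	return !sym->ignore;
}
```

With this model, counted_now(NULL) is true while counted_before(NULL)
is false, which is exactly the case that lets unresolved samples into
the histogram today.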

The problem with still processing samples without a resolved symbol is
that the histogram entries created for them will "poison" that memory
area.

We end up with all of these histogram entries, one for every single PC
ever sampled in these temporarily unresolvable areas. So all future
samples match up, and accumulate, into those unresolved entries.

They never get collapsed back into a histogram entry which covers the
eventually resolved symbol's range.

If symbol resolution is in fact racy as I describe above, then we
should handle unresolved symbols better, especially the case where an
initially unresolved area becomes resolved.
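
As a rough illustration of what "handling it better" could look like,
here is a toy model, not perf's hists code. Every name in it
(toy_sym, toy_entry, toy_hist, toy_resolve, toy_hist_add,
toy_hist_recollapse) is hypothetical. It only models two things: that
unresolved samples get one bucket per sampled PC, and that a
re-collapse pass could retry resolution and fold those buckets into
the bucket of the symbol that eventually covers them. Whether perf
should do this, and where, is an open question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ENTRIES 64

/* Hypothetical symbol covering the address range [start, end). */
struct toy_sym {
	uint64_t start, end;
};

/* One histogram bucket: keyed by symbol when resolved, else by raw PC. */
struct toy_entry {
	const struct toy_sym *sym;	/* NULL while unresolved */
	uint64_t ip;			/* key used only when sym == NULL */
	uint64_t hits;
};

struct toy_hist {
	struct toy_entry e[TOY_MAX_ENTRIES];
	size_t nr;
};

/* Resolve ip against a symbol that may not be known yet (sym may be NULL). */
const struct toy_sym *toy_resolve(const struct toy_sym *sym, uint64_t ip)
{
	return (sym && ip >= sym->start && ip < sym->end) ? sym : NULL;
}

/* Find or create the bucket for (sym, ip) and add hits to it. */
void toy_hist_add(struct toy_hist *h, const struct toy_sym *sym,
		  uint64_t ip, uint64_t hits)
{
	for (size_t i = 0; i < h->nr; i++) {
		struct toy_entry *e = &h->e[i];

		if (sym ? e->sym == sym : (e->sym == NULL && e->ip == ip)) {
			e->hits += hits;
			return;
		}
	}
	assert(h->nr < TOY_MAX_ENTRIES);
	h->e[h->nr++] = (struct toy_entry){ .sym = sym, .ip = ip, .hits = hits };
}

/*
 * Hypothetical re-collapse pass: retry resolution for every unresolved
 * bucket and fold the ones that now resolve into the symbol's bucket.
 */
void toy_hist_recollapse(struct toy_hist *h, const struct toy_sym *sym)
{
	struct toy_hist out = { .nr = 0 };

	for (size_t i = 0; i < h->nr; i++) {
		const struct toy_entry *e = &h->e[i];
		const struct toy_sym *s = e->sym ? e->sym : toy_resolve(sym, e->ip);

		toy_hist_add(&out, s, e->ip, e->hits);
	}
	*h = out;
}
```

In this model, samples added while toy_resolve() returns NULL end up
in one bucket per PC and stay there even after the symbol is known.
Running toy_hist_recollapse() once the symbol resolves merges them
into a single bucket with the summed hit count.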