perf overlapping maps...

From: David Miller
Date: Sat Oct 20 2018 - 00:05:54 EST



Symbols aren't always right on sparc, and the owner of a sample is
even set to "unknown" from time to time, so I turned on some
debugging to investigate.

One thing that stands out is that we get overlapping maps all the
time.

So I tried to narrow down how this happens. Here is one case: we get
a new thread fork event for emacs-gtk before the MMAP events, so we go:

thread__fork(thread, parent, timestamp)
{
    ...
    thread__clone_map_groups(thread, parent)
    {
        ...
        map_groups__clone(thread, parent->mg)

Dumping this map_groups__clone() operation I see:

map_groups__clone: parent 0x10000425420 --> 0x10000418fb0
map_groups__clone: new [0000010000000000:0000010000110000] /bin/bash
map_groups__clone: new [0000010000212000:000001000021e000] /bin/bash
map_groups__clone: new [000001000021e000:00000100002a0000] /tmp/perf-1309.map
map_groups__clone: new [fff0000100000000:fff0000100024000] /lib/sparc64-linux-gnu/ld-2.27.so
map_groups__clone: new [fff0000100124000:fff0000100126000] /lib/sparc64-linux-gnu/ld-2.27.so
map_groups__clone: new [fff0000100128000:fff0000100152000] /lib/sparc64-linux-gnu/libtinfo.so.6.1
map_groups__clone: new [fff0000100254000:fff0000100256000] /lib/sparc64-linux-gnu/libtinfo.so.6.1
map_groups__clone: new [fff0000100258000:fff000010025c000] /lib/sparc64-linux-gnu/libdl-2.27.so
map_groups__clone: new [fff000010035c000:fff000010035e000] /lib/sparc64-linux-gnu/libdl-2.27.so
map_groups__clone: new [fff000010046a000:fff000010046c000] [vdso]
map_groups__clone: new [fff000010046c000:fff00001005cc000] /lib/sparc64-linux-gnu/libc-2.27.so
map_groups__clone: new [fff00001006d0000:fff00001006d4000] /lib/sparc64-linux-gnu/libc-2.27.so
map_groups__clone: new [fff00001006d4000:fff00001006d6000] /tmp/perf-1309.map
map_groups__clone: new [fff0000100874000:fff000010087e000] /lib/sparc64-linux-gnu/libnss_files-2.27.so
map_groups__clone: new [fff000010097e000:fff0000100980000] /lib/sparc64-linux-gnu/libnss_files-2.27.so
map_groups__clone: new [fff0000100980000:fff0000100986000] /tmp/perf-1309.map

It's inheriting maps from the parent bash shell that invoked emacs-gtk,
which makes no sense at all.

We proceed to process the MMAP events which have the proper mappings
for emacs-gtk, and eventually we happen to hit a mapping that overlaps
with one of the address ranges of the parent bash shell.

For the ranges that don't overlap, bogus mappings from the parent bash
shell are left behind in the emacs-gtk thread's map group.
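
To make the overlap condition concrete, here is a minimal sketch. It is
not the perf code: "struct range", ranges_overlap() and count_overlaps()
are hypothetical stand-ins, and it assumes each dumped [start:end] pair
is a half-open range, which the back-to-back entries in the dump
suggest.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a map: a half-open range [start, end). */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Two half-open ranges overlap when each one starts before the other ends. */
static bool ranges_overlap(const struct range *a, const struct range *b)
{
	return a->start < b->end && b->start < a->end;
}

/* Count how many inherited ranges a newly mapped range collides with. */
static int count_overlaps(const struct range *inherited, int nr,
			  const struct range *new_map)
{
	int i, hits = 0;

	for (i = 0; i < nr; i++) {
		if (ranges_overlap(&inherited[i], new_map))
			hits++;
	}
	return hits;
}
```

A new mapping that merely touches the end of an inherited range does
not count as overlapping here. Inherited ranges that never collide with
anything are the ones that stay behind as stale entries.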

The above trace is simply from "./perf record 2>x.log", nothing fancy.

What we are doing above can't be right.

Yes, when processing real perf events from the kernel for a fork
event, we should do that inheritance stuff. But if we are
synthesizing the fork to build threads and maps for already running
processes, we absolutely should not perform the map groups clone.

One solution I've come up with is:

1) When synthesizing a fork event, set PERF_RECORD_MISC_COMM_EXEC in
header->misc.

2) Use this to elide the map groups clone in
thread__clone_map_groups().
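
Roughly, the two steps could look like the sketch below. This is a toy
model, not a patch: every sketch_* type and function is a hypothetical
stand-in for the real perf structures, and SKETCH_MISC_COMM_EXEC just
mirrors the bit that PERF_RECORD_MISC_COMM_EXEC uses in the UAPI header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the PERF_RECORD_MISC_COMM_EXEC bit; the name is hypothetical. */
#define SKETCH_MISC_COMM_EXEC (1 << 13)

/* Hypothetical, cut-down versions of the event header and fork event. */
struct sketch_header {
	uint16_t misc;
};

struct sketch_fork_event {
	struct sketch_header header;
	uint32_t pid;
	uint32_t ppid;
};

/* Hypothetical thread: only tracks how many maps it holds. */
struct sketch_thread {
	int nr_maps;
};

/* Step 1: a synthesized fork is tagged in header.misc. */
static void sketch_synthesize_fork(struct sketch_fork_event *ev,
				   uint32_t pid, uint32_t ppid)
{
	ev->header.misc = SKETCH_MISC_COMM_EXEC;
	ev->pid = pid;
	ev->ppid = ppid;
}

/* Step 2: the clone is skipped when the caller says so. */
static int sketch_clone_map_groups(struct sketch_thread *thread,
				   const struct sketch_thread *parent,
				   bool do_maps_clone)
{
	if (!do_maps_clone)
		return 0;

	/* Stand-in for copying every map from the parent's map group. */
	thread->nr_maps = parent->nr_maps;
	return 0;
}

/* Fork handling: only real kernel fork events inherit the parent's maps. */
static int sketch_process_fork(struct sketch_thread *thread,
			       const struct sketch_thread *parent,
			       const struct sketch_fork_event *ev)
{
	bool do_maps_clone = !(ev->header.misc & SKETCH_MISC_COMM_EXEC);

	return sketch_clone_map_groups(thread, parent, do_maps_clone);
}
```

With this, a thread built from a synthesized fork starts with an empty
map group and gets its maps only from the MMAP events that follow,
while a fork event coming from the kernel still inherits as before.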

Comments?