Re: [PATCH/RFC 4/9] perf symbols: Load kernel module symbols ASAP

From: Namhyung Kim
Date: Sun Jun 25 2017 - 10:19:17 EST


On Fri, Jun 23, 2017 at 11:26:04AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Fri, Jun 23, 2017 at 02:48:22PM +0900, Namhyung Kim escreveu:
> > When loading kernel symbols from /proc/kallsyms, it might have different
> > addresses for modules. We should honor the mmap event recorded in a
> > perf.data so load the module symbols when it sees the event so that it
> > cannot be overridden by symbols in /proc/kallsyms later.
>
> > Cc: Adrian Hunter <adrian.hunter@xxxxxxxxx>
> > Cc: Wang Nan <wangnan0@xxxxxxxxxx>
> > Signed-off-by: Namhyung Kim <namhyung@xxxxxxxxxx>
> > ---
> > tools/perf/util/machine.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> > index d838f893e639..799efe920f0c 100644
> > --- a/tools/perf/util/machine.c
> > +++ b/tools/perf/util/machine.c
> > @@ -656,6 +656,9 @@ struct map *machine__findnew_module_map(struct machine *machine, u64 start,
> >
> > map_groups__insert(&machine->kmaps, map);
> >
> > + if (dso->has_build_id)
> > + map__load(map);
>
> How could this be set at this point? I just tried processing a
> PERF_RECORD_MMAP event for a kernel module and at this point it is set
> to false :-\
>
> Humm, but I did it for video.ko, that has no hits in this case, so this
> must be coming from reading the perf.data build-id table... /me tries
> that... yeah, when processing the buildid table in perf.data we
> populate the machine->dsos list and set the dso->has_build_id flag:
>
> [root@jouet ~]# perf buildid-list | grep .ko
> 57a47c2f0d85ef5726b80e8f55403c3f508a4eac /lib/modules/4.12.0-rc4+/kernel/drivers/net/wireless/intel/iwlwifi/iwlwifi.ko
> [root@jouet ~]#
> (gdb) bt
> #0 machine__findnew_module_map (machine=0x21c6c78, start=18446744072647282688, filename=0x7fd8fc057a00 "/lib/modules/4.12.0-rc4+/kernel/drivers/net/wireless/intel/iwlwifi/iwlwifi.ko")
> at util/machine.c:650
> #1 0x00000000005094a3 in machine__process_kernel_mmap_event (machine=0x21c6c78, event=0x7fd8fc0579d8) at util/machine.c:1274
> #2 0x00000000005099de in machine__process_mmap_event (machine=0x21c6c78, event=0x7fd8fc0579d8, sample=0x7fffffffbef0) at util/machine.c:1433
> #3 0x00000000004cc021 in perf_event__process_mmap (tool=0x7fffffffc440, event=0x7fd8fc0579d8, sample=0x7fffffffbef0, machine=0x21c6c78) at util/event.c:1265
> #4 0x0000000000512686 in machines__deliver_event (machines=0x21c6c78, evlist=0x21c6f30, event=0x7fd8fc0579d8, sample=0x7fffffffbef0, tool=0x7fffffffc440, file_offset=10712)
> at util/session.c:1250
> #5 0x00000000005129ce in perf_session__deliver_event (session=0x21c6b90, event=0x7fd8fc0579d8, sample=0x7fffffffbef0, tool=0x7fffffffc440, file_offset=10712) at util/session.c:1310
> #6 0x0000000000513255 in perf_session__process_event (session=0x21c6b90, event=0x7fd8fc0579d8, file_offset=10712) at util/session.c:1490
> #7 0x0000000000513ff4 in __perf_session__process_events (session=0x21c6b90, data_offset=232, data_size=23120, file_size=23352) at util/session.c:1867
> #8 0x00000000005141f2 in perf_session__process_events (session=0x21c6b90) at util/session.c:1921
> #9 0x00000000004427c9 in __cmd_report (rep=0x7fffffffc440) at builtin-report.c:575
> #10 0x000000000044427c in cmd_report (argc=0, argv=0x7fffffffe160) at builtin-report.c:1054
> #11 0x00000000004bad91 in run_builtin (p=0xa28fd0 <commands+240>, argc=2, argv=0x7fffffffe160) at perf.c:296
> #12 0x00000000004baffe in handle_internal_command (argc=2, argv=0x7fffffffe160) at perf.c:348
> #13 0x00000000004bb150 in run_argv (argcp=0x7fffffffdfbc, argv=0x7fffffffdfb0) at perf.c:392
> #14 0x00000000004bb52a in main (argc=2, argv=0x7fffffffe160) at perf.c:530
> (gdb) p dso
> $127 = (struct dso *) 0x21c7dc0
> (gdb) p dso->has_build_id
> $128 = 1 '\001'
> (gdb)
>
> So what you're doing is to load it here... perhaps we should not do
> that, and instead in the kallsyms loading code, when processing the
> symbols for this specific module, discard them and leave its symbol
> table unpopulated, then, later, when resolving a sample in this module,
> it will see that it is not loaded, will see that it has a build-id and
> will use that, and _refuse_ to get it from anywhere else than an ELF
> file with that build-id.
>
> I.e. in general, when we have a build-id for a given DSO, we should not
> read it from kallsyms or any other file that don't have a matching
> build-id.
>
> The only possibility would be: hey, I have a build id, and reading the
> currently loaded module (/sys/module/e1000e/notes/.note.gnu.build-id) I
> see it _still_ has that same build-id, ok, I can read it from kallsyms.
>
> What do you think?

I'm ok with that, will change to compare build-id before loading
symbols.

Thanks,
Namhyung