Re: [PATCH perf/core 1/4] perf-probe: Fix to show correct locations for events on modules

From: Masami Hiramatsu
Date: Tue Jan 10 2017 - 18:52:02 EST


On Tue, 10 Jan 2017 10:18:35 -0300
Arnaldo Carvalho de Melo <acme@xxxxxxxxxx> wrote:

> Em Sat, Jan 07, 2017 at 02:25:09PM +0900, Masami Hiramatsu escreveu:
> > Fix to show correct locations for events on modules by
> > relocating given address. Currently the relocation is
> > done when we failed to find the address in debuginfo,
> > but for modules it always makes a mistakes.
>
> Try to provide precise instructions on how to reproduce, for instance,
> here I'm not being able to reproduce:
>
> [root@jouet ~]# perf probe -m i915 chv_prepare_pll
> Added new event:
> probe:chv_prepare_pll (on chv_prepare_pll in i915)
>
> You can now use it in all perf tools, such as:
>
> perf record -e probe:chv_prepare_pll -aR sleep 1
>
> [root@jouet ~]# perf probe -l
> probe:chv_prepare_pll (on chv_prepare_pll in i915)
> probe:e1000_xmit_frame (on e1000_get_link_up_info_80003es2lan:7@intel/e1000e/80003es2lan.c in e1000e)
> [root@jouet ~]#
>
> So it doesn't seem to "always make mistakes", what are the precise
> conditions to reproduce this problem?

OK, I found mymistakes. chv_prepare_pll in my i915 module is optimized
and have isra.X suffix, so perf ends up with searching it in map.
(That issue is fixed in [3/4] in this series as you may know)


This happens when the module text size is enough big, bigger than
sh_addr, because original code retries with given address + sh_addr
if it failed to find CU DIE at the given address. Any address smaller
than sh_addr always fails and it retries.

On my environment, the sh_addr of ".text" section is 0x10030. Since
i915 is a huge kernel module, we can see this issue as below.

$ grep "[Tt] .*\[i915\]" /proc/kallsyms | sort | head -n1
ffffffffc0270000 t i915_switcheroo_can_switch [i915]

ffffffffc0270000 + 0x10030 = ffffffffc0280030, so we'll check
symbols cross this boundary.

$ grep "[Tt] .*\[i915\]" /proc/kallsyms | grep -B1 ^ffffffffc028 | head -n 2
ffffffffc027ff80 t haswell_init_clock_gating [i915]
ffffffffc0280110 t valleyview_init_clock_gating [i915]

So setup probes on both function and see what happen.

$ sudo ./perf probe -m i915 -a haswell_init_clock_gating -a valleyview_init_clock_gating
Added new events:
probe:haswell_init_clock_gating (on haswell_init_clock_gating in i915)
probe:valleyview_init_clock_gating (on valleyview_init_clock_gating in i915)

You can now use it in all perf tools, such as:

perf record -e probe:valleyview_init_clock_gating -aR sleep 1

$ sudo ./perf probe -l
probe:haswell_init_clock_gating (on haswell_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)
probe:valleyview_init_clock_gating (on i915_vga_set_decode:4@gpu/drm/i915/i915_drv.c in i915)

As you can see, haswell_init_clock_gating is correctly shown, but
valleyview_init_clock_gating is not.

With this patch, both events shown correctly.

$ sudo ./perf probe -l
probe:haswell_init_clock_gating (on haswell_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)
probe:valleyview_init_clock_gating (on valleyview_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)

I'll update patch description and resend with fix of [4/4].

Thank you,

--
Masami Hiramatsu <mhiramat@xxxxxxxxxx>