Re: [PATCH v6 6/6] perf util: use correct IP mapping to find srcline for hist entry

From: Milian Wolff
Date: Tue Oct 24 2017 - 04:59:58 EST


On Freitag, 20. Oktober 2017 07:15:33 CEST Namhyung Kim wrote:
> Hi Milian,
>
> On Thu, Oct 19, 2017 at 12:54:18PM +0200, Milian Wolff wrote:
> > On Mittwoch, 18. Oktober 2017 20:53:50 CEST Milian Wolff wrote:
> > > When inline frame resolution is disabled, a bogus srcline is obtained
> > > for hist entries:
> > >
> > > ~~~~~
> > > $ perf report -s sym,srcline --no-inline --stdio -g none
> > >
> > > 95.21% 0.00% [.] __libc_start_main
> > >
> > > __libc_start_main+18446603358170398953 95.21% 0.00% [.] _start
> > >
> > > _start+18446650082411225129 46.67% 0.00%
> > > [.]
> > >
> > > main
> > >
> > > main+18446650082411225208 38.75%
> > >
> > > 0.00% [.] hypot
> > >
> > > hypot+18446603358164312084 23.75% 0.00% [.] main
> > >
> > > main+18446650082411225151 20.83% 20.83% [.]
> > >
> > > std::generate_canonical<double, 53ul,
> > > std::linear_congruential_engine<unsigned long, 16807ul, 0ul,
> > > 2147483647ul>
> > >
> > > > random.h:143 18.12% 0.00% [.] main
> > >
> > > main+18446650082411225165 13.12% 13.12% [.]
> > >
> > > std::generate_canonical<double, 53ul,
> > > std::linear_congruential_engine<unsigned long, 16807ul, 0ul,
> > > 2147483647ul>
> > >
> > > > random.tcc:3330 4.17% 4.17% [.] __hypot_finite
> > > >
> > > __hypot_finite+163 4.17% 4.17% [.]
> > > std::generate_canonical<double,
> > >
> > > 53ul, std::linear_congruential_engine<unsigned long, 16807ul, 0ul,
> > > 2147483647ul> > random.tcc:3333 4.17% 0.00% [.] __hypot_finite
> > >
> > > __hypot_finite+18446603358164312227 4.17% 0.00%
> > > [.]
> > >
> > > std::generate_canonical<double, 53ul,
> > > std::linear_congruential_engine<unsigned long, 16807ul, 0ul,
> > > 2147483647ul>
> > >
> > > > std::generate_canonical<double, 53ul, std::line 2.92% 0.00% [.]
> > >
> > > std::generate_canonical<double, 53ul,
> > > std::linear_congruential_engine<unsigned long, 16807ul, 0ul,
> > > 2147483647ul>
> > >
> > > > std::generate_canonical<double, 53ul, std::line 2.50% 2.50% [.]
> > >
> > > __hypot_finite
> > >
> > > __hypot_finite+11 2.50%
> > > 2.50%
> > >
> > > [.] __hypot_finite
> > >
> > > __hypot_finite+24 2.50%
> > >
> > > 0.00% [.] __hypot_finite
> > >
> > > __hypot_finite+18446603358164312075 2.50% 0.00% [.] __hypot_finite
> > >
> > > __hypot_finite+18446603358164312088 ~~~~~
> > >
> > > Note how we get very large offsets to main and cannot see any srcline
> > > from one of the complex or random headers, even though the instruction
> > > pointers actually lie in code inlined from there.
> > >
> > > This patch fixes the mapping to use map__objdump_2mem instead of
> > > map__objdump_2mem in hist_entry__get_srcline. This fixes the srcline
> > > values for me when inline resolution is disabled:
> > >
> > > ~~~~~
> > > $ perf report -s sym,srcline --no-inline --stdio -g none
> > >
> > > 95.21% 0.00% [.] __libc_start_main
> > >
> > > __libc_start_main+233 95.21% 0.00% [.] _start
> > >
> > > _start+41 46.88% 0.00% [.] main
> > >
> > > complex:589 43.96% 0.00% [.] main
> > >
> > > random.h:185 38.75% 0.00% [.] hypot
> > >
> > > hypot+20 20.83% 0.00% [.] std::generate_canonical<double, 53ul,
> > >
> > > std::linear_congruential_engine<unsigned long, 16807ul, 0ul,
> > > 2147483647ul>
> > >
> > > > random.h:143 13.12% 0.00% [.] std::generate_canonical<double,
> > > > 53ul,
> > >
> > > std::linear_congruential_engine<unsigned long, 16807ul, 0ul,
> > > 2147483647ul>
> > >
> > > > random.tcc:3330 4.17% 4.17% [.] __hypot_finite
> > > >
> > > __hypot_finite+140715545239715 4.17% 4.17% [.]
> > >
> > > std::generate_canonical<double, 53ul,
> > > std::linear_congruential_engine<unsigned long, 16807ul, 0ul,
> > > 2147483647ul>
> > >
> > > > std::generate_canonical<double, 53ul, std::line 4.17% 0.00% [.]
> > >
> > > __hypot_finite
> > >
> > > __hypot_finite+163 4.17%
> > > 0.00%
> > >
> > > [.] std::generate_canonical<double, 53ul,
> > > std::linear_congruential_engine<unsigned long, 16807ul, 0ul,
> > > 2147483647ul>
> > >
> > > > random.tcc:3333 2.92% 2.92% [.] std::generate_canonical<double,
> > >
> > > 53ul, std::linear_congruential_engine<unsigned long, 16807ul, 0ul,
> > > 2147483647ul> > std::generate_canonical<double, 53ul, std::line 2.50%
> > > 2.50% [.] __hypot_finite
> > >
> > > __hypot_finite+140715545239563 2.50% 2.50% [.] __hypot_finite
> > >
> > > __hypot_finite+140715545239576 2.50% 2.50% [.]
> > >
> > > std::generate_canonical<double, 53ul,
> > > std::linear_congruential_engine<unsigned long, 16807ul, 0ul,
> > > 2147483647ul>
> > >
> > > > std::generate_canonical<double, 53ul, std::line 2.50% 2.50% [.]
> > >
> > > std::generate_canonical<double, 53ul,
> > > std::linear_congruential_engine<unsigned long, 16807ul, 0ul,
> > > 2147483647ul>
> > >
> > > > std::generate_canonical<double, 53ul, std::line 2.50% 0.00% [.]
> > >
> > > __hypot_finite
> > >
> > > __hypot_finite+11 ~~~~~
> > >
> > > Cc: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
> > > Cc: Namhyung Kim <namhyung@xxxxxxxxxx>
> > > Cc: Yao Jin <yao.jin@xxxxxxxxxxxxxxx>
> > > Cc: Jiri Olsa <jolsa@xxxxxxxxxx>
> > > Signed-off-by: Milian Wolff <milian.wolff@xxxxxxxx>
> > >
> > > Note how most of the large offset values are now gone. Most notably,
> > > we get proper srcline resolution for the random.h and complex headers.
> > > ---
> > >
> > > tools/perf/util/sort.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
> > > index 006d10a0dc96..6f3d109078a3 100644
> > > --- a/tools/perf/util/sort.c
> > > +++ b/tools/perf/util/sort.c
> > > @@ -334,7 +334,7 @@ char *hist_entry__get_srcline(struct hist_entry *he)
> > >
> > > if (!map)
> > >
> > > return SRCLINE_UNKNOWN;
> > >
> > > - return get_srcline(map->dso, map__rip_2objdump(map, he->ip),
> > > + return get_srcline(map->dso, map__objdump_2mem(map, he->ip),
> > >
> > > he->ms.sym, true, true);
> > >
> > > }
> >
> > Sorry, this patch was declined by Nahmyung before, please discard it - I
> > forgot to do that before resending v6.
>
> I looked into it and found a bug handling cumulative (children)
> entries. For chilren entries that has no self period, the al->addr
> (so he->ip) ends up having an doubly-mapped address.
>
> It seems to be there from the beginning but only affects entries that
> have no srclines - finding srcline itself is done using a different
> address but it will show the invalid address if no srcline was found.
> I think we should fix the commit c7405d85d7a3 ("perf tools: Update
> cpumode for each cumulative entry").
>
> Could you please test the following patch works for you?

Sorry for the delay, nearly forgot about this mail. The patch below does help
in my situation, thanks! Can you commit it please?

> diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
> index 35a920f09503..d18cdcc8d132 100644
> --- a/tools/perf/util/callchain.c
> +++ b/tools/perf/util/callchain.c
> @@ -1074,10 +1074,7 @@ int fill_callchain_info(struct addr_location *al,
> struct callchain_cursor_node * {
> al->map = node->map;
> al->sym = node->sym;
> - if (node->map)
> - al->addr = node->map->map_ip(node->map, node->ip);
> - else
> - al->addr = node->ip;
> + al->addr = node->ip;
>
> if (al->sym == NULL) {
> if (hide_unresolved)



--
Milian Wolff | milian.wolff@xxxxxxxx | Senior Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts