Re: [PATCH 1/1] Revert "perf report: Append inlines to non-DWARF callchains"

From: Arnaldo Carvalho de Melo
Date: Mon Aug 07 2023 - 09:35:07 EST


Em Mon, Aug 07, 2023 at 01:00:08PM +0200, Artem Savkov escreveu:
> On Wed, Aug 02, 2023 at 09:43:40AM +0200, Artem Savkov wrote:
> > Hi Arnaldo,
> >
> > On Tue, Aug 01, 2023 at 06:42:47PM -0300, Arnaldo Carvalho de Melo wrote:
> > > Hi Artem,
> > >
> > > Can you please double check this? I reproduced with:
> > >
> > > git checkout 46d21ec067490ab9cdcc89b9de5aae28786a8b8e
> > > build it
> > > perf record -a -g sleep 5s
> > > perf report
> > >
> > > Do you get the same slowness and then reverting it, i.e. just
> > > going to HEAD~ and rebuilding getting a fast 'perf report' startup, i.e.
> > > without the inlines in the callchains?
> >
> > With a simple test like this I definitely get a slowdown, but not sure
> > if it can be called excessive.
> >
> > Below are the times I got by running 'time perf report' and hitting 'q'
> > during load so that it quits as soon as it is loads up. Tested on a
> > freshly updated fedora 38.
>
> My bad, I had wrong debuginfo installed for the kernel I tested. I can
> reproduce it with the correct one. Looks like vmlinux is just too much
> for addr2line. Maybe we can skip it but leave other inlines in, like so:

That is a possibilit, and probably we could make it cheaper by looking
at the cpumode, avoiding calling addr2line when we didn't makage to
resolve the symbol, etc.

We also may want to have this as an option that has to be explicitely
enabled, like --resolve-inlines, as this will add overhead no matter if
we stop calling addr2line and do it more efficiently, etc.

Fact is, we're late in the 6.5 schedule, so the best thing now is to
just revert the patch and then try again later, ok?

- Arnaldo

> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> index 11de3ca8d4fa7..fef309cd401f7 100644
> --- a/tools/perf/util/machine.c
> +++ b/tools/perf/util/machine.c
> @@ -2388,7 +2388,9 @@ static int add_callchain_ip(struct thread *thread,
> ms.map = map__get(al.map);
> ms.sym = al.sym;
>
> - if (!branch && append_inlines(cursor, &ms, ip) == 0)
> + if (!branch && ms.map && ms.map->dso &&
> + strcmp(ms.map->dso->short_name, "[kernel.vmlinux]") &&
> + append_inlines(cursor, &ms, ip) == 0)
> goto out;
>
> srcline = callchain_srcline(&ms, al.addr);
>
> > > - Arnaldo
> > >
> > > ----
> > >
> > > This reverts commit 46d21ec067490ab9cdcc89b9de5aae28786a8b8e.
> > >
> > > The tests were made with a specific workload, further tests on a
> > > recently updated fedora 38 system with a system wide perf.data file
> > > shows 'perf report' taking excessive time, so lets revert this until a
> > > full investigation and improvement on the addr2line support code is
> > > made.
> > >
> > > Cc: Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx>
> > > Cc: Artem Savkov <asavkov@xxxxxxxxxx>
> > > Cc: Namhyung Kim <namhyung@xxxxxxxxxx>
> > > Cc: Adrian Hunter <adrian.hunter@xxxxxxxxx>
> > > Cc: Alexander Shishkin <alexander.shishkin@xxxxxxxxxxxxxxx>
> > > Cc: Ian Rogers <irogers@xxxxxxxxxx>
> > > Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> > > Cc: Jiri Olsa <jolsa@xxxxxxxxxx>
> > > Cc: Mark Rutland <mark.rutland@xxxxxxx>
> > > Cc: Masami Hiramatsu <mhiramat@xxxxxxxxxx>
> > > Cc: Milian Wolff <milian.wolff@xxxxxxxx>
> > > Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > > Signed-off-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
> > > ---
> > > tools/perf/util/machine.c | 5 -----
> > > 1 file changed, 5 deletions(-)
> > >
> > > diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> > > index 4e62843d51b7dbf9..f4cb41ee23cdbcfc 100644
> > > --- a/tools/perf/util/machine.c
> > > +++ b/tools/perf/util/machine.c
> > > @@ -45,7 +45,6 @@
> > >
> > > static void __machine__remove_thread(struct machine *machine, struct thread_rb_node *nd,
> > > struct thread *th, bool lock);
> > > -static int append_inlines(struct callchain_cursor *cursor, struct map_symbol *ms, u64 ip);
> > >
> > > static struct dso *machine__kernel_dso(struct machine *machine)
> > > {
> > > @@ -2385,10 +2384,6 @@ static int add_callchain_ip(struct thread *thread,
> > > ms.maps = maps__get(al.maps);
> > > ms.map = map__get(al.map);
> > > ms.sym = al.sym;
> > > -
> > > - if (!branch && append_inlines(cursor, &ms, ip) == 0)
> > > - goto out;
> > > -
> > > srcline = callchain_srcline(&ms, al.addr);
> > > err = callchain_cursor_append(cursor, ip, &ms,
> > > branch, flags, nr_loop_iter,
> > > --
> > > 2.41.0
> > >
> >
> > --
> > Artem
>
> --
> Artem
>

--

- Arnaldo