Re: [PATCH v3 6/7] perf report: Add --latency flag

From: Dmitry Vyukov
Date: Mon Feb 03 2025 - 09:59:09 EST


On Thu, 30 Jan 2025 at 07:30, Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
>
> On Wed, Jan 29, 2025 at 08:12:51AM +0100, Dmitry Vyukov wrote:
> > On Wed, 29 Jan 2025 at 06:03, Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
> > >
> > > On Mon, Jan 27, 2025 at 10:58:53AM +0100, Dmitry Vyukov wrote:
> > > > Add record/report --latency flag that allows to capture and show
> > > > latency-centric profiles rather than the default CPU-consumption-centric
> > > > profiles. For latency profiles record captures context switch events,
> > > > and report shows Latency as the first column.
> > >
> > > It'd be nice if you could add a small example output in the commit
> > > message.
> > >
> > > >
> > > > Signed-off-by: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
> > > > Cc: Namhyung Kim <namhyung@xxxxxxxxxx>
> > > > Cc: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
> > > > Cc: Ian Rogers <irogers@xxxxxxxxxx>
> > > > Cc: linux-perf-users@xxxxxxxxxxxxxxx
> > > > Cc: linux-kernel@xxxxxxxxxxxxxxx
> > > > ---
> > > [SNIP]
> > > > diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
> > > > index bc4c3acfe7552..2b6023de7a53a 100644
> > > > --- a/tools/perf/util/sort.c
> > > > +++ b/tools/perf/util/sort.c
> > > > @@ -2622,6 +2622,7 @@ struct hpp_dimension {
> > > > const char *name;
> > > > struct perf_hpp_fmt *fmt;
> > > > int taken;
> > > > + int was_taken;
> > > > };
> > > >
> > > > #define DIM(d, n) { .name = n, .fmt = &perf_hpp__format[d], }
> > > > @@ -3513,6 +3514,7 @@ static int __hpp_dimension__add(struct hpp_dimension *hd,
> > > > return -1;
> > > >
> > > > hd->taken = 1;
> > > > + hd->was_taken = 1;
> > > > perf_hpp_list__register_sort_field(list, fmt);
> > > > return 0;
> > > > }
> > > > @@ -3547,10 +3549,15 @@ static int __hpp_dimension__add_output(struct perf_hpp_list *list,
> > > > return 0;
> > > > }
> > > >
> > > > -int hpp_dimension__add_output(unsigned col)
> > > > +int hpp_dimension__add_output(unsigned col, bool implicit)
> > > > {
> > > > + struct hpp_dimension *hd;
> > > > +
> > > > BUG_ON(col >= PERF_HPP__MAX_INDEX);
> > > > - return __hpp_dimension__add_output(&perf_hpp_list, &hpp_sort_dimensions[col]);
> > > > + hd = &hpp_sort_dimensions[col];
> > > > + if (implicit && !hd->was_taken)
> > > > + return 0;
> > >
> > > I don't understand why you need 'was_taken' field. Can it use the
> > > 'taken' field?
> >
> > I've started getting failed asserts in
> > perf_hpp__cancel_cumulate/latency when removing columns still linked
> > into sort order list.
>
> Ok.
>
> >
> > I've figured out that columns we implicitly add in setup_overhead and
> > perf_hpp__init must match exactly. If we add one in setup_overhead,
> > but not in perf_hpp__init, then it starts failing.
>
> Right, I don't remember why we don't have unregister_sort_field or let
> the perf_hpp__column_unregister() remove it from the sort_list too.
>
> >
> > Taken does not work b/c there is reset_dimensions call between these
> > functions, so they actually add the same columns twice to field/sort
> > lists (and that prevents failures in
> > perf_hpp__cancel_cumulate/latency).
>
> I see.
>
> >
> > Initially I've just tried to match logic in setup_overhead and in
> > perf_hpp__init. But it turned out quite messy, duplicate logic, and
> > e.g. in setup_overhead we look at sort_order, but in perf_hpp__init we
> > already can't look at it b/c we already altered it in setup_overhead.
> > That logic would also be quite fragile. Adding was_taken looks like
> > the simplest, most reliable, and less fragile with respect to future
> > changes approach.
>
> Maybe just remove it from the sort_list. There should be no reason to
> keep it in the list.

I've tried that (and removing setup_overhead entirely), but none of
that yielded the same result.

E.g.:

perf report --sort latency,symbol

currently gives intended:

Latency Symbol
4.37% [k] native_queued_spin_lock_slowpath
2.71% [.] _PyEval_EvalFrameDefault
2.18% [.] iterative_hash
2.00% [k] clear_page_erms

but if I remove was_taken logic and make perf_hpp__column_unregister()
remove from the sort list, I get:

Overhead Latency Symbol
38.17% 4.37% [k] native_queued_spin_lock_slowpath
0.86% 2.71% [.] _PyEval_EvalFrameDefault
0.47% 2.18% [.] iterative_hash
0.68% 2.00% [k] clear_page_erms

So, so far the current version of the code is unfortunately the
simplest version with intended behavior that I come up with.



> > > > + return __hpp_dimension__add_output(&perf_hpp_list, hd);
> > > > }
> > > >
> > > > int sort_dimension__add(struct perf_hpp_list *list, const char *tok,
> > > > @@ -3809,10 +3816,24 @@ static char *setup_overhead(char *keys)
> > > > if (sort__mode == SORT_MODE__DIFF)
> > > > return keys;
> > > >
> > > > - keys = prefix_if_not_in("overhead", keys);
> > > > -
> > > > - if (symbol_conf.cumulate_callchain)
> > > > - keys = prefix_if_not_in("overhead_children", keys);
> > > > + if (symbol_conf.prefer_latency) {
> > > > + keys = prefix_if_not_in("overhead", keys);
> > > > + keys = prefix_if_not_in("latency", keys);
> > > > + if (symbol_conf.cumulate_callchain) {
> > > > + keys = prefix_if_not_in("overhead_children", keys);
> > > > + keys = prefix_if_not_in("latency_children", keys);
> > > > + }
> > > > + } else if (!keys || (!strstr(keys, "overhead") &&
> > > > + !strstr(keys, "latency"))) {
> > > > + if (symbol_conf.enable_latency)
> > > > + keys = prefix_if_not_in("latency", keys);
> > > > + keys = prefix_if_not_in("overhead", keys);
> > > > + if (symbol_conf.cumulate_callchain) {
> > > > + if (symbol_conf.enable_latency)
> > > > + keys = prefix_if_not_in("latency_children", keys);
> > > > + keys = prefix_if_not_in("overhead_children", keys);
> > > > + }
> > > > + }
> > >
> > > Do you really need this complexity? I think it's better to fix the order
> > > of columns in both case.
> >
> > This is sort order which affects ordering of row, and order of rows is
> > basically the most important thing for a profiler. If we fix the
> > order, it will be showing completely different things.
> >
> > "latency" and "overhead" are equivalent with respect to their
> > fundamentality for a profile. So we shouldn't be adding any of them,
> > if any of them are already explicitly specified by the user.
> >
> > For example, the command from tips.txt:
> >
> > perf report --hierarchy --sort latency,parallelism,comm,symbol
> >
> > if we prepend "overhead", it all falls apart.
> >
> > Or for 2 default modes (normals and latency) we want "overhead" or
> > "latency" to come first. Prepending "latency" for the current CPU mode
> > would lead to completely different ordering.
>
> I see. You want to sort by overhead in the default mode even if it has
> implicitly-added latency field, and to sort by latency if --latency is
> given explicitly.
>
> I think it can be done with the following command line.
>
> $ perf report -F overhead,latency,parallelism,comm,sym -s overhead
>
> or
>
> $ perf report -F overhead,latency,parallelism,comm,sym -s latency
>
> IOW, you can keep the same output column ordering and sort the result
> differently. But I'm not sure if it'd make the code simpler. :)
>
> Thanks,
> Namhyung
>