Re: [PATCH 2/2] perf hist: Fix bogus profiles when filters are enabled
From: Namhyung Kim
Date: Fri Jan 10 2025 - 19:30:08 EST
On Fri, Jan 10, 2025 at 01:59:54PM +0100, Dmitry Vyukov wrote:
> On Thu, 9 Jan 2025 at 22:59, Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
> >
> > On Wed, Jan 08, 2025 at 08:36:54AM +0100, Dmitry Vyukov wrote:
> > > When a filtered column is not present in the sort order, profiles become
> > > arbitrary broken. Filtered and non-filtered entries are collapsed
> > > together, and the filtered-by field ends up with a random value (either
> > > from a filtered or non-filtered entry). If we end up with filtered
> > > entry/value, then the whole collapsed entry will be filtered out and will
> > > be missing in the profile. If we end up with non-filtered entry/value,
> > > then the overhead value will be wrongly larger (include some subset
> > > of filtered out samples).
> > >
> > > This leads to very confusing profiles. The problem is hard to notice,
> > > and if noticed hard to understand. If the filter is for a single value,
> > > then it can be fixed by adding the corresponding field to the sort order
> > > (provided user understood the problem). But if the filter is for multiple
> > > values, it's impossible to fix b/c there is no concept of binary sorting
> > > based on filter predicate (we want to group all non-filtered values in
> > > one bucket, and all filtered values in another).
> > >
> > > Examples of affected commands:
> > > perf report --tid=123
> > > perf report --sort overhead,symbol --comm=foo,bar
> > >
> > > Fix this by considering filtered status as the highest priority
> > > sort/collapse predicate.
> >
> > Do you mean when you specify filters not in a sort list?
> >
> > I guess it's an undefined behavior.. probably better to warn users not
> > to do that and exit. The hist entries are built using only the fields
> > in the sort list. Filtering by unspecified fields would be meaningless
> > and won't work well IMHO.
>
> We could warn as well, but I do want this feature for:
> https://lore.kernel.org/linux-perf-users/CACT4Y+an1LSY15f9MS_vnbaaeeqMf+k4-Dqqfu-zwcUAHFNk=w@xxxxxxxxxxxxxx/T/#t
> I think it may be useful for other cases as well, let's discuss the
> feature below.
> Why won't it work well? At least in my local manual testing, it worked
> well. This does make hist built using filtered fields as well.
I thought hist entry was constructed only for the fields in the sort
list and then apply the filter on the hist entry. But looking at the
code, I've realized that the filter is set before the hist entry is
created. Then it would work..
>
>
> > > As a side effect this effectively adds a new feature of showing profile
> > > where several lines are combined based on arbitrary filtering predicate.
> > > For example, showing symbols from binaries foo and bar combined together,
> > > but not from other binaries; or showing combined overhead of several
> > > particular threads.
> >
> > I'm not sure I'm following. I'm seeing this even with your change.
> >
> > Without filters:
> >
> > $ ./perf report -s dso --stdio -q
> > 98.41% perf
> > 1.11% [kernel.kallsyms]
> > 0.41% ld-linux-x86-64.so.2
> > 0.05% libLLVM-16.so.1
> > 0.03% libc.so.6
> >
> >
> > With filter:
> >
> > $ ./perf report -s dso -d '[kernel.kallsyms],libc.so.6' --stdio -q
> > 1.11% [kernel.kallsyms]
> > 0.03% libc.so.6
> >
> > You said you want to have them combined together, right?
> > But I think the current behavior makes more sense.
>
>
> If you add a field to both sort and filter, then this change does not
> affect anything (filtered fields already participate in
> sorting/merging). It affects things when the filter field is not in
> sort.
>
> Consider the following queries:
> "What symbols consume time in this subset of binaries (but not in the rest)?"
> Naively, one would try:
> perf report --sort=symbol --comm=cc1,ld
> And this deceptively works and shows something, but as far as I
> understand these are random numbers now.
Right, that's what I expected. But I agree with you that this type of
filtering can be useful and intuitive.
> With this change you can actually do this (you will see things like
> malloc/free in both of these binaries, or if one profiles e.g. Go
> programs, then there will be lots of common symbols from
> runtime/stdlin).
Yep, by not merging the filtered entries you can get the precise profile
of symbols from the specified binaries only.
>
> Another example: profile for this particular subset of threads/cpus
> (but not for the rest).
>
> And in particular for:
> https://lore.kernel.org/linux-perf-users/CACT4Y+an1LSY15f9MS_vnbaaeeqMf+k4-Dqqfu-zwcUAHFNk=w@xxxxxxxxxxxxxx/T/#t
> I want to enable: --parallelism=1-8 filter.
> Currently you will either see random garbage numbers, or if you add
> 'parallelism' to sort order, then you will see each symbol/comm
> duplicated 8 times for each separate value of parallelism (which is
> not useful). But currently you can't see symbol/comm for these 8
> parallelism levels combined, but not for the rest.
Interesting, I'll take a look at it later.
>
> All-in-all, this looks like a change that defines behavior that's
> currently undefined in a useful and intuitive way, without changing
> any of the currently defined behaviors.
Makes sense.
Thanks,
Namhyung