Re: [Patch v5 0/6] Bug fixes on topdown events reordering

From: Ian Rogers
Date: Wed Oct 02 2024 - 20:57:28 EST


On Wed, Oct 2, 2024 at 5:00 PM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
>
> On Tue, Oct 01, 2024 at 03:32:04PM -0700, Ian Rogers wrote:
> > On Tue, Oct 1, 2024 at 2:02 PM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
> > >
> > > On Fri, 13 Sep 2024 08:47:06 +0000, Dapeng Mi wrote:
> > >
> > > > Changes:
> > > > v5 -> v6:
> > > > * no function change.
> > > > * rebase patchset to latest code of perf-tool-next tree.
> > > > * Add Kan's reviewed-by tag.
> > > >
> > > > History:
> > > > v4: https://lore.kernel.org/all/20240816122938.32228-1-dapeng1.mi@xxxxxxxxxxxxxxx/
> > > > v3: https://lore.kernel.org/all/20240712170339.185824-1-dapeng1.mi@xxxxxxxxxxxxxxx/
> > > > v2: https://lore.kernel.org/all/20240708144204.839486-1-dapeng1.mi@xxxxxxxxxxxxxxx/
> > > > v1: https://lore.kernel.org/all/20240702224037.343958-1-dapeng1.mi@xxxxxxxxxxxxxxx/
> > > >
> > > > [...]
> > >
> > > Applied to perf-tools-next, thanks!
> >
> > I disagreed with an early patch set and the issue wasn't resolved. Specifically:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/commit/?h=perf-tools-next&id=3b5edc0421e2598a0ae7f0adcd592017f37e3cdf
> > ```
> > /* Followed by topdown events. */
> > if (arch_is_topdown_metrics(lhs) && !arch_is_topdown_metrics(rhs))
> > return -1;
> > - if (!arch_is_topdown_metrics(lhs) && arch_is_topdown_metrics(rhs))
> > + /*
> > + * Move topdown events forward only when topdown events
> > + * are not in same group with previous event.
> > + */
> > + if (!arch_is_topdown_metrics(lhs) && arch_is_topdown_metrics(rhs) &&
> > + lhs->core.leader != rhs->core.leader)
> > return 1;
> > ```
> > Is a broken comparator as the lhs then rhs behavior varies from the
> > rhs then lhs behavior. The qsort implementation can randomly order the
> > events.
> > Please drop/revert.
>
> Can you please provide an example when it's broken? I'm not sure how it
> can produce new errors, but it seems to fix a specific problem. Do you
> have a new test failure after this change?

It may work with a particular sort routine in a particular library
today, the issue is that if the sort routine were say changed from:

if (cmp(a, b) < 0)

to:

if (cmp(b, a) > 0)

the sort would vary with this change even though such a change in the
sorting code is a no-op. It is fundamental algorithms that this code
is broken, I'm not going to spend the time finding a list of
instructions and a sort routine to demonstrate this.

Thanks,
Ian