Re: [Patch v5 0/6] Bug fixes on topdown events reordering
From: Liang, Kan
Date: Thu Oct 03 2024 - 10:57:47 EST
On 2024-10-02 8:57 p.m., Ian Rogers wrote:
> On Wed, Oct 2, 2024 at 5:00 PM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
>>
>> On Tue, Oct 01, 2024 at 03:32:04PM -0700, Ian Rogers wrote:
>>> On Tue, Oct 1, 2024 at 2:02 PM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
>>>>
>>>> On Fri, 13 Sep 2024 08:47:06 +0000, Dapeng Mi wrote:
>>>>
>>>>> Changes:
>>>>> v5 -> v6:
>>>>> * no function change.
>>>>> * rebase patchset to latest code of perf-tool-next tree.
>>>>> * Add Kan's reviewed-by tag.
>>>>>
>>>>> History:
>>>>> v4: https://lore.kernel.org/all/20240816122938.32228-1-dapeng1.mi@xxxxxxxxxxxxxxx/
>>>>> v3: https://lore.kernel.org/all/20240712170339.185824-1-dapeng1.mi@xxxxxxxxxxxxxxx/
>>>>> v2: https://lore.kernel.org/all/20240708144204.839486-1-dapeng1.mi@xxxxxxxxxxxxxxx/
>>>>> v1: https://lore.kernel.org/all/20240702224037.343958-1-dapeng1.mi@xxxxxxxxxxxxxxx/
>>>>>
>>>>> [...]
>>>>
>>>> Applied to perf-tools-next, thanks!
>>>
>>> I disagreed with an early patch set and the issue wasn't resolved. Specifically:
>>>
>>> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/commit/?h=perf-tools-next&id=3b5edc0421e2598a0ae7f0adcd592017f37e3cdf
>>> ```
>>> /* Followed by topdown events. */
>>> if (arch_is_topdown_metrics(lhs) && !arch_is_topdown_metrics(rhs))
>>> return -1;
>>> - if (!arch_is_topdown_metrics(lhs) && arch_is_topdown_metrics(rhs))
>>> + /*
>>> + * Move topdown events forward only when topdown events
>>> + * are not in same group with previous event.
>>> + */
>>> + if (!arch_is_topdown_metrics(lhs) && arch_is_topdown_metrics(rhs) &&
>>> + lhs->core.leader != rhs->core.leader)
>>> return 1;
>>> ```
>>> Is a broken comparator as the lhs then rhs behavior varies from the
>>> rhs then lhs behavior. The qsort implementation can randomly order the
>>> events.
>>> Please drop/revert.
>>
>> Can you please provide an example when it's broken? I'm not sure how it
>> can produce new errors, but it seems to fix a specific problem. Do you
>> have a new test failure after this change?
>
> It may work with a particular sort routine in a particular library
> today, the issue is that if the sort routine were say changed from:
>
> if (cmp(a, b) < 0)
>
> to:
>
> if (cmp(b, a) > 0)
>
> the sort would vary with this change even though such a change in the
> sorting code is a no-op. It is fundamental algorithms that this code
> is broken, I'm not going to spend the time finding a list of
> instructions and a sort routine to demonstrate this.
I'm not sure I fully understand. But just want to mention that the
change only impacts the Topdown perf metric group, which is only
available for the ICL and later p-core. Nothing will change if no perf
metrics topdown events are used. Very limited impact.
I like the patch set is because it provides examples and tests to cover
the possible combination of the slots event and topdown metrics events.
So we will have a clear expectation for different combinations caused by
the perf metrics mess.
If the algorithms cannot be changed, can you please give some
suggestions, especially for the sample read failure?
Thanks,
Kan