Re: [PATCH] perf test: Remove atomics from test_loop to avoid test failures

From: Nick Forrington
Date: Sat Nov 25 2023 - 14:10:34 EST



On 25/11/2023 03:05, Leo Yan wrote:
> Hi all,
>
> On Fri, Nov 24, 2023 at 08:57:52PM +0100, Michael Petlan wrote:
> > On Thu, 2 Nov 2023, Nick Forrington wrote:
> > > The current use of atomics can lead to test failures, as tests (such as
> > > tests/shell/record.sh) search for samples with "test_loop" as the
> > > top-most stack frame, but find frames related to the atomic operation
> > > (e.g. __aarch64_ldadd4_relax).
>
> I am confused by above description. As I went through the script
> record.sh, which is the only test invoking the program 'test_loop',
> but I don't find any test is related with stack frame.
>
> Do I miss anything? I went through record.sh but no clue why the
> failure is caused by stack frame. All the testings use command:
>
> if ! perf report -i "${perfdata}" -q | grep -q "${testsym}"
> ...
> fi
>
> @Nick, could you narrow down which specific test case causing the
> failure.
>
> [...]


All checks for ${testsym} in record.sh (including the example you
provided) can fail, as the expected symbol (test_loop) is not the
top-most function on the stack, and is therefore not the symbol
associated with the sample.


Example perf report output:

# Overhead  Command  Shared Object          Symbol
# ........  .......  ..................... .............................
#
    99.53%  perf     perf                   [.] __aarch64_ldadd4_relax

...
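
As a rough illustration (not part of record.sh), the failure mode can be
reproduced without perf by running the same kind of grep check over the
report line above. The check_sym helper and the report variable are made
up for this sketch; only the grep pattern follows the record.sh snippet
quoted earlier.

```shell
#!/bin/sh
# Hypothetical stand-in for `perf report -i "${perfdata}" -q` output:
# the single flat-report line shown above.
report='    99.53%  perf     perf                   [.] __aarch64_ldadd4_relax'
testsym="test_loop"

# Hypothetical helper mirroring the record.sh pattern:
# succeed only if the symbol appears somewhere in the report text.
check_sym() {
  if ! printf '%s\n' "$1" | grep -q "$2"
  then
    echo "FAIL: missing $2"
    return 1
  fi
  echo "ok: found $2"
}

# The expected symbol is absent, so the check fails ...
check_sym "$report" "$testsym" || true
# ... while the outlined atomic helper is what the sample is attributed to.
check_sym "$report" "__aarch64_ldadd4_relax"
```

The first call prints "FAIL: missing test_loop" and the second prints
"ok: found __aarch64_ldadd4_relax", which is the same mismatch the test
sees when the sample lands in the atomic helper.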


You can see the issue when recording/reporting with call stacks:

# Children      Self  Command  Shared Object          Symbol
# ........  ........  .......  ..................... ..........................................................
#
    99.52%    99.52%  perf     perf                   [.] __aarch64_ldadd4_relax
            |
            |--49.77%--0xffffb905a5dc
            |          0xffffb8ff0aec
            |          thfunc
            |          test_loop
            |          __aarch64_ldadd4_relax

...


> > I believe that it was there to prevent the compiler to optimize the loop
> > out or some reason like that. Hopefully, it will work even without that
> > on all architectures with all compilers that are used for building perf...
>
> Agreed.
>
> As said above, I'd like to step back a bit for making clear what's the
> exactly failure caused by the program.


I don't think this loop could be sensibly optimised away, as it depends on "done", which is defined at file scope (and assigned by a signal handler).


Cheers,
Nick


> Thanks,
> Leo