Re: [PATCH v1 5/8] perf test: Tag parallel failing shell tests with "(exclusive)"
From: Ian Rogers
Date: Fri Oct 11 2024 - 12:52:34 EST
On Fri, Oct 11, 2024 at 3:29 AM James Clark <james.clark@xxxxxxxxxx> wrote:
>
>
>
> On 11/10/2024 11:01 am, James Clark wrote:
> >
> >
> > On 11/10/2024 8:35 am, Ian Rogers wrote:
> >> Some shell tests compete for resources and so can't run with other
> >> tests, tag such tests. The "(exclusive)" stems from shared/exclusive
> >> to describe how the tests run as if holding a lock.
> >>
> >> Signed-off-by: Ian Rogers <irogers@xxxxxxxxxx>
> >> ---
> >> tools/perf/tests/shell/perftool-testsuite_report.sh | 2 +-
> >> tools/perf/tests/shell/record.sh | 2 +-
> >> tools/perf/tests/shell/record_lbr.sh | 2 +-
> >> tools/perf/tests/shell/record_offcpu.sh | 2 +-
> >> tools/perf/tests/shell/stat_all_pmu.sh | 2 +-
> >> tools/perf/tests/shell/test_intel_pt.sh | 2 +-
> >> tools/perf/tests/shell/test_stat_intel_tpebs.sh | 2 +-
> >> 7 files changed, 7 insertions(+), 7 deletions(-)
> >>
> >
> > The following ones would also need to be marked as exclusive, not sure
> > if you can include those here or you want me to send a patch:
> >
> > tools/perf/tests/shell/coresight/asm_pure_loop.sh
> > tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh
> > tools/perf/tests/shell/coresight/thread_loop_check_tid_10.sh
> > tools/perf/tests/shell/coresight/thread_loop_check_tid_2.sh
> > tools/perf/tests/shell/coresight/unroll_loop_thread_10.sh
> > tools/perf/tests/shell/test_arm_coresight.sh
> > tools/perf/tests/shell/test_arm_coresight_disasm.sh
> > tools/perf/tests/shell/test_arm_spe.sh
I'll add them to v2 with your Suggested-by tag. Thanks.
> > In theory all tests using probes would also need to be exclusive because
> > they install and delete probes globally. In practice I don't think I saw
> > any failures, whether that's just luck or because of some skips I'm not
> > sure.
> >
> > And this one fails consistently in parallel mode on Arm:
> >
> > 22: Number of exit events of a simple workload : FAILED!
This looks like it could be a real issue. I believe the test is doing
uid filtering:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/tests/task-exit.c?h=perf-tools-next#n49
Uid filtering scans /proc looking for processes of the given uid. This
is inherently racy with processes exiting, and we'd be better off using
a BPF filter to drop samples with the wrong uid: same effect but no
racy /proc scan. I've seen the racy /proc scan cause termination
issues, so possibly that is what you are seeing.
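To make the race concrete, here is a minimal sketch of a uid scan over
/proc. It is not the perf implementation; the names is_pid_name() and
count_procs_for_uid() and the proc_root parameter are made up for
illustration. The point is that a process can exit between readdir()
returning its entry and the later stat(), so any such scan has to
tolerate entries that vanish:

```c
#include <ctype.h>
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Return 1 if name is all decimal digits, i.e. looks like a /proc PID entry. */
static int is_pid_name(const char *name)
{
	if (*name == '\0')
		return 0;
	for (; *name; name++) {
		if (!isdigit((unsigned char)*name))
			return 0;
	}
	return 1;
}

/*
 * Hypothetical helper: count the PID directories under proc_root that are
 * owned by uid. Returns -1 if proc_root cannot be opened. *vanished counts
 * entries that disappeared between readdir() and stat(), which is the race
 * window discussed above.
 */
static int count_procs_for_uid(const char *proc_root, uid_t uid, int *vanished)
{
	struct dirent *ent;
	struct stat st;
	char path[4096];
	int count = 0;
	DIR *dir = opendir(proc_root);

	*vanished = 0;
	if (!dir)
		return -1;

	while ((ent = readdir(dir)) != NULL) {
		if (!is_pid_name(ent->d_name))
			continue;
		snprintf(path, sizeof(path), "%s/%s", proc_root, ent->d_name);
		if (stat(path, &st) != 0) {
			/* The process exited after we saw its entry. */
			if (errno == ENOENT || errno == ESRCH)
				(*vanished)++;
			continue;
		}
		if (st.st_uid == uid)
			count++;
	}
	closedir(dir);
	return count;
}
```

Calling count_procs_for_uid("/proc", geteuid(), &vanished) on a busy
machine, such as one running many tests in parallel, is where a non-zero
vanished count would be expected to show up. A filter applied to the
samples themselves has no such window.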
It could also be that tweaking the retry count will fix things:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/tests/task-exit.c?h=perf-tools-next#n134
Anyway, for now I think it is expedient to mark the test as exclusive.
> > But it's a C test so I assume there isn't an exclusive mechanism to skip
> > it? It doesn't look like it should be affected though, so maybe we could
> > leave it failing as a real bug.
> >
>
> Oh I see it says in the cover letter it can be set for C tests. But can
> that be done through all the existing TEST_CASE() etc macros?
Currently only whole suites can be exclusive. We could add macros for
exclusive C tests but my preference would be to make the test work
non-exclusive. I'll make it possible for individual test cases to be
exclusive and mark this one.
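For reference, a rough sketch of how a runner could recognise the tag
in a test description. This assumes a plain substring match on
"(exclusive)"; the actual check in the series may differ, and
desc_is_exclusive() and count_exclusive() are hypothetical names:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define EXCLUSIVE_TAG "(exclusive)"

/* Assumption: a test is exclusive if its description contains the tag. */
static bool desc_is_exclusive(const char *desc)
{
	return desc != NULL && strstr(desc, EXCLUSIVE_TAG) != NULL;
}

/* Count how many of the n descriptions would have to run on their own. */
static size_t count_exclusive(const char *const *descs, size_t n)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++) {
		if (desc_is_exclusive(descs[i]))
			count++;
	}
	return count;
}
```

A runner built on this could start the untagged tests in parallel and
then run the tagged ones one at a time.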
Thanks,
Ian