[PATCH v1 00/14] perf test: Harness improvements

From: Ian Rogers

Date: Wed May 13 2026 - 19:05:13 EST

Motivation & Key Enhancements

1. **Test Harness Acceleration & Parallel Polling**

Previously, when running tests in parallel mode (`perf test -v`),
child processes writing large amounts of logging output to pipes
(such as Granite Rapids PMU metric parsing) would fill the 64KB
pipe buffer and block until the parent read from it. The parent
harness only polled the pipe of the "current" sequential test
waiting to be printed, so every other child with a full pipe stalled
and the run was close to serialized.

- Refactored the parallel poll loop to drain output pipes from all
active children simultaneously into dynamic per-child buffers
(`struct strbuf`). Reaping occurs asynchronously out of order,
while final console printing remains strictly sequential.

- Added explicit pipe draining after child process termination to
prevent losing trailing log data.

- **Benchmark**: This drops parallel verbose execution time for the
PMU events suite from ~35 seconds down to ~5.9 seconds (an ~83%
reduction in run time).
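
As an illustration of the draining approach described above, here is
a minimal sketch of one poll pass over all child pipes. It is not the
code from the series: `struct child_out`, `child_out_append()` and
`drain_children()` are hypothetical names, and a plain growable
buffer stands in for `struct strbuf`.

```c
#include <errno.h>
#include <poll.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical per-child state; the series uses struct strbuf instead. */
struct child_out {
	int fd;		/* read end of the child's pipe, -1 once closed */
	char *buf;	/* NUL-terminated output collected so far */
	size_t len;
	size_t cap;
};

/* Append n bytes to the child's buffer, growing it as needed. */
static int child_out_append(struct child_out *c, const char *data, size_t n)
{
	if (c->len + n + 1 > c->cap) {
		size_t cap = c->cap ? c->cap : 4096;
		char *tmp;

		while (cap < c->len + n + 1)
			cap *= 2;
		tmp = realloc(c->buf, cap);
		if (!tmp)
			return -1;
		c->buf = tmp;
		c->cap = cap;
	}
	memcpy(c->buf + c->len, data, n);
	c->len += n;
	c->buf[c->len] = '\0';
	return 0;
}

/*
 * Poll every open pipe once and read whatever is ready, so that no
 * child stays blocked on a full pipe while another test is "current".
 * Returns the number of pipes still open, or -1 on error.
 */
static int drain_children(struct child_out *children, int nr, int timeout_ms)
{
	struct pollfd *pfds = calloc(nr > 0 ? nr : 1, sizeof(*pfds));
	int open_fds = 0, ret, i;

	if (!pfds)
		return -1;
	for (i = 0; i < nr; i++) {
		pfds[i].fd = children[i].fd;	/* poll() ignores negative fds */
		pfds[i].events = POLLIN;
	}
	do {
		ret = poll(pfds, nr, timeout_ms);
	} while (ret < 0 && errno == EINTR);
	if (ret < 0) {
		free(pfds);
		return -1;
	}
	for (i = 0; i < nr; i++) {
		char tmp[4096];
		ssize_t n;

		if (children[i].fd < 0)
			continue;
		if (pfds[i].revents & (POLLIN | POLLHUP | POLLERR)) {
			n = read(children[i].fd, tmp, sizeof(tmp));
			if (n > 0) {
				if (child_out_append(&children[i], tmp, n)) {
					free(pfds);
					return -1;
				}
			} else if (n == 0 || errno != EINTR) {
				/* EOF or hard error: stop watching this pipe */
				close(children[i].fd);
				children[i].fd = -1;
			}
		}
		if (children[i].fd >= 0)
			open_fds++;
	}
	free(pfds);
	return open_fds;
}
```

A caller would invoke `drain_children()` in a loop until it returns
0, then print each child's buffer in test order, which keeps the
console output sequential even though reads happen out of order.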

2. **Dynamic Test Suites & Granular PMU Subtests**

Monolithic test cases (like "Parsing of PMU event table metrics")
previously evaluated hundreds of tables in a single sequential run,
making failures difficult to isolate.

- Added `setup` callbacks and private data pointers (`void *priv`)
to `struct test_suite` and `struct test_case`, enabling dynamic
runtime testcase generation.

- Split the PMU events metric parsing test into individual subtests
(one pair of real/fake PMU tests per metric table), allowing them
to execute concurrently and report granular results.
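
The shape of a `setup` callback that generates test cases at runtime
might look roughly like the sketch below. The `demo_*` structures,
the table names and the helper functions are all hypothetical and
only mirror the idea of a `setup` hook plus a `void *priv` pointer;
the real `struct test_suite` and `struct test_case` layouts are
defined by the patches themselves.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct test_case / struct test_suite. */
struct demo_test_case {
	char name[64];
	int (*run)(struct demo_test_case *tc);
	void *priv;		/* per-case data, here a table name */
};

struct demo_test_suite {
	const char *desc;
	int (*setup)(struct demo_test_suite *suite);
	struct demo_test_case *cases;
	int nr_cases;
};

/* Made-up table names standing in for the real metric tables. */
static const char *const demo_tables[] = { "table_a", "table_b", "table_c" };

/* Placeholder check: a real test would parse the table's metrics. */
static int demo_check_table(struct demo_test_case *tc)
{
	const char *table = tc->priv;

	return table && table[0] ? 0 : -1;
}

/* Build one real/fake pair of cases per table at runtime. */
static int demo_setup(struct demo_test_suite *suite)
{
	int nr_tables = sizeof(demo_tables) / sizeof(demo_tables[0]);
	int nr = 2 * nr_tables;
	int i;

	suite->cases = calloc(nr, sizeof(*suite->cases));
	if (!suite->cases)
		return -1;
	for (i = 0; i < nr; i++) {
		struct demo_test_case *tc = &suite->cases[i];

		snprintf(tc->name, sizeof(tc->name), "%s (%s PMU)",
			 demo_tables[i / 2], (i & 1) ? "fake" : "real");
		tc->run = demo_check_table;
		tc->priv = (void *)demo_tables[i / 2];
	}
	suite->nr_cases = nr;
	return 0;
}

/* Run setup (if any), then every generated case; returns failures. */
static int demo_run_suite(struct demo_test_suite *suite)
{
	int failed = 0, i;

	if (suite->setup && suite->setup(suite))
		return -1;
	for (i = 0; i < suite->nr_cases; i++) {
		if (suite->cases[i].run(&suite->cases[i]))
			failed++;
	}
	return failed;
}
```

Because each generated case carries its own `priv` pointer, a harness
can schedule the cases independently and report a result per table.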

3. **Advanced Triaging & Automated Summary Reporting**

Triaging failures in highly verbose automated runs previously
required scrolling through thousands of lines of console output.

- Introduced a configurable failure snippet processor
(`--failure-snippet-lines`) that extracts the context lines
matching failure keywords (`error`, `fail`, `segv`, `abort`) while
preserving outline markers.

- Added a summary printed at the end of the test run, giving
pass/skip/fail totals and an explicit list of the failed test cases
for cross-referencing.

- Fixed subtest status column alignment (`: Ok`) for multi-digit
test indexes.

- Updated shell script SPDX header parsing to prevent license
strings from being incorrectly extracted as test descriptions.
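
To make the keyword-based snippet extraction concrete, here is a
minimal sketch under simplifying assumptions: it only does
case-insensitive keyword matching with a line limit and does not
model the outline markers. `line_has_keyword()` and
`extract_snippet()` are hypothetical helpers, not functions from the
series.

```c
#include <ctype.h>
#include <string.h>

/* Keywords taken from the description above, stored in lower case. */
static const char *const snippet_keywords[] = {
	"error", "fail", "segv", "abort",
};

/* Case-insensitive check: do the first len bytes contain a keyword? */
static int line_has_keyword(const char *line, size_t len)
{
	size_t k, i, j;

	for (k = 0; k < sizeof(snippet_keywords) / sizeof(snippet_keywords[0]); k++) {
		const char *kw = snippet_keywords[k];
		size_t klen = strlen(kw);

		for (i = 0; i + klen <= len; i++) {
			for (j = 0; j < klen; j++) {
				if (tolower((unsigned char)line[i + j]) != kw[j])
					break;
			}
			if (j == klen)
				return 1;
		}
	}
	return 0;
}

/*
 * Copy up to max_lines keyword-matching lines from log into out.
 * Lines that would overflow out are skipped. Returns lines copied.
 */
static int extract_snippet(const char *log, int max_lines,
			   char *out, size_t out_sz)
{
	size_t used = 0;
	int copied = 0;

	if (out_sz == 0)
		return 0;
	out[0] = '\0';
	while (*log && copied < max_lines) {
		const char *eol = strchr(log, '\n');
		size_t len = eol ? (size_t)(eol - log) : strlen(log);

		/* Need room for the line, a newline and the NUL. */
		if (line_has_keyword(log, len) && used + len + 2 <= out_sz) {
			memcpy(out + used, log, len);
			used += len;
			out[used++] = '\n';
			out[used] = '\0';
			copied++;
		}
		log += len + (eol ? 1 : 0);
	}
	return copied;
}
```

In this sketch `max_lines` plays the role that the value passed to
`--failure-snippet-lines` would play; how the option is actually
interpreted is defined by the patch.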

4. **JUnit XML Reporting & CI Integration**

Added a `-j`/`--junit` command-line option to generate standard
JUnit XML test reports (`test.xml`).

- Captures individual test suite and subtest execution latency
alongside XML-escaped failure logs and skip reasons.

- Measures durations with `clock_gettime(CLOCK_MONOTONIC)` so they
are unaffected by wall-clock jumps, and records `end_time` when the
child process exits, so that reported latencies do not include the
delay before the sequential printer reaches an out-of-order result.

- Added a standalone shell test script to validate generated JUnit
XML syntax using Python's `ElementTree` parser.
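
The two mechanisms described above, monotonic timing and XML
escaping of log text, can be sketched as follows. The helper names
(`elapsed_secs()`, `run_timed()`, `xml_escape()`) are hypothetical
and the code is an illustration, not the implementation in the
series.

```c
#define _POSIX_C_SOURCE 200809L	/* for clock_gettime() */
#include <string.h>
#include <time.h>

/* Difference between two CLOCK_MONOTONIC samples, in seconds. */
static double elapsed_secs(const struct timespec *start,
			   const struct timespec *end)
{
	return (double)(end->tv_sec - start->tv_sec) +
	       (double)(end->tv_nsec - start->tv_nsec) / 1e9;
}

/* Time fn(); the end sample is taken as soon as fn() returns. */
static int run_timed(int (*fn)(void), double *secs)
{
	struct timespec start, end;
	int ret;

	clock_gettime(CLOCK_MONOTONIC, &start);
	ret = fn();
	clock_gettime(CLOCK_MONOTONIC, &end);
	*secs = elapsed_secs(&start, &end);
	return ret;
}

/*
 * Escape the five XML special characters from in into out.
 * Output is truncated (at an entity boundary) if out is too small.
 * Returns the number of bytes written, excluding the NUL.
 */
static size_t xml_escape(const char *in, char *out, size_t out_sz)
{
	size_t used = 0;

	if (out_sz == 0)
		return 0;
	for (; *in; in++) {
		char one[2] = { *in, '\0' };
		const char *rep;
		size_t n;

		switch (*in) {
		case '&':  rep = "&amp;";  break;
		case '<':  rep = "&lt;";   break;
		case '>':  rep = "&gt;";   break;
		case '"':  rep = "&quot;"; break;
		case '\'': rep = "&apos;"; break;
		default:   rep = one;      break;
		}
		n = strlen(rep);
		if (used + n + 1 > out_sz)
			break;
		memcpy(out + used, rep, n);
		used += n;
	}
	out[used] = '\0';
	return used;
}
```

In a harness that forks children, the end sample would be taken
where the child is reaped rather than where its output is printed,
which is the distinction the bullet above draws.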

5. **Elimination of External C Compiler Dependencies**

The Intel PT shell test (`test_intel_pt.sh`) previously compiled
external C workloads at runtime using `/usr/bin/cc`, which
frequently breaks in hermetic or minimal CI environments.

- Created a built-in self-modifying JIT workload (`perf test -w
jitdump`) and switched the script to use built-in workloads.

- To work across architectures without an external C compiler, the
JIT workload's instruction arrays encode `CHK_BYTE` as an immediate
in the opcodes for x86, ARM32, ARM64, RISC-V, PowerPC, MIPS,
LoongArch, and s390x, with `#else` fallbacks for unsupported
architectures.

Ian Rogers (14):
perf jevents.py: Make generated C code more kernel style
perf pmu-events: Add API to get metric table name and iterate tables
perf test: Drain pipe after child finishes to avoid losing output
perf test: Support dynamic test suites with setup callback and private
data
perf test pmu-events: A sub-test per metric table
perf test: Refactor parallel poll loop to drain all pipes
simultaneously
perf test: Show snippet failure output for verbose=1
perf test: Add summary reporting
perf test: Fix subtest status alignment for multi-digit indexes
perf test: Skip shebang and SPDX comments in shell test descriptions
perf test: Split monolithic 'util' test suite into sub-tests
perf test: Add -j/--junit option for JUnit XML test reports
perf test: Add shell test to validate JUnit XML reporting output
perf test: Remove /usr/bin/cc dependency from Intel PT shell test

tools/lib/subcmd/run-command.c | 4 +-
tools/perf/pmu-events/empty-pmu-events.c | 8811 +++++++++++------
tools/perf/pmu-events/jevents.py | 836 +-
tools/perf/pmu-events/pmu-events.h | 4 +
tools/perf/tests/builtin-test.c | 587 +-
tools/perf/tests/pmu-events.c | 155 +-
tools/perf/tests/shell/test_intel_pt.sh | 169 +-
.../tests/shell/test_test_junit_output.sh | 63 +
tools/perf/tests/tests-scripts.c | 63 +-
tools/perf/tests/tests.h | 3 +
tools/perf/tests/util.c | 20 +-
tools/perf/tests/workloads/Build | 1 +
tools/perf/tests/workloads/jitdump.c | 165 +
tools/perf/util/jitdump.h | 3 +-
14 files changed, 7187 insertions(+), 3697 deletions(-)
create mode 100755 tools/perf/tests/shell/test_test_junit_output.sh
create mode 100644 tools/perf/tests/workloads/jitdump.c

--
2.54.0.563.g4f69b47b94-goog