Re: [PATCH v3 04/13] perf tests: Add robust record retry helper and use subsecond workloads
From: Namhyung Kim
Date: Tue Jun 23 2026 - 02:06:30 EST
On Mon, Jun 22, 2026 at 04:59:15PM -0700, Ian Rogers wrote:
> On Thu, Jun 18, 2026 at 6:25 AM Arnaldo Carvalho de Melo
> <acme@xxxxxxxxxx> wrote:
> >
> > On Wed, Jun 17, 2026 at 03:37:06PM -0700, Namhyung Kim wrote:
> > > On Tue, Jun 16, 2026 at 09:48:09AM -0700, Ian Rogers wrote:
> > > > Introduce `perf_record_with_retry` and `perf_record_cleanup` in a shared
> > > > library `tests/shell/lib/perf_record.sh` to prevent record test failures
> > > > caused by transient recording or workload delays.
> > > >
> > > > Update `record.sh`, `record_lbr.sh`, `pipe_test.sh`, `kvm.sh`, and
> > > > `stat_all_pfm.sh` to use this robust record retry logic. These tests now
> > > > start with very short durations (e.g. 0.01 seconds) and scale up if the
> > > > initial recording failed to capture samples, significantly improving test
> > > > execution speed on success while remaining resilient to slow systems.
> > > >
> > > > Assisted-by: Antigravity:gemini-3.1-pro
> > > > Signed-off-by: Ian Rogers <irogers@xxxxxxxxxx>
> > > > ---
> > > [SNIP]
> > > > @@ -60,22 +71,29 @@ test_kvm_stat() {
> > > > test_kvm_record_report() {
> > > > echo "Testing perf kvm record/report"
> > > >
> > > > - echo "Recording kvm profile for pid ${qemu_pid}..."
> > > > - # Use --host to avoid needing guest symbols/mounts for this simple test
> > > > - # We just want to verify the command runs and produces data
> > > > - # We run in background and kill it because 'perf kvm record' appends options
> > > > - # after the command, which breaks 'sleep' (e.g. it gets '-e cycles').
> > > > - perf kvm --host record -p "${qemu_pid}" -o "${perfdata}" &
> > > > - rec_pid=$!
> > > > - sleep 1
> > > > - kill -INT "${rec_pid}"
> > > > - wait "${rec_pid}" || true
> > > > + local duration
> > > > + local success=false
> > > > + for duration in 1 2 4 8; do
> > > > + echo "Recording kvm profile for pid ${qemu_pid} (duration ${duration}s)..."
> > > > + rm -f "${perfdata}" "${perfdata}".old
> > > > +
> > > > + perf kvm --host record -p "${qemu_pid}" -o "${perfdata}" &
> > > > + local rec_pid=$!
> > > > + sleep ${duration}
> > > > + kill -INT "${rec_pid}"
> > > > + wait "${rec_pid}" || true
> > >
> > > Can this be just like below?
> > >
> > > perf kvm --host record -p "${qemu_pid}" -o "${perfdata}" sleep ${duration}
> >
> > Right, looks equivalent and simpler,
> >
> > I was making notes to address the flakiness that is making me use the
> > end summary of entries that fail (great feature) to run them in
> > isolation to check if they work that way, which most do :-\
>
> Here is an explanation from antigravity (sorry I have some typing
> issues atm), but it comes down to how the command line is built up:
>
> Normally, we could just write:
>
> perf kvm --host record -p "qemuₚid" - o"{perfdata}" sleep
> ${duration}
>
> However, on Intel x86 architectures, if no -e option is supplied
> to perf kvm record , it calls architecture-specific setup logic (
> __kvm_add_default_arch_event_x86() ) which automatically appends "-e"
> and "cycles" to the end of the argv array.
> Because perf record option parsing stops at the first non-option
> (which is "sleep" ), any arguments appended after the workload name
> are forwarded directly to the workload command. This turns the
> executed command into:
>
> sleep 1 -e cycles
>
> Which causes sleep to crash with:
> sleep: invalid option -- 'e'
>
> I'll revise this in v4 to be clearer, but we should probably also fix
> the option parsing issue.
Thanks for the explanation! I found it's in the deleted comment.
Let me think about the option parsing issue.
Thanks,
Namhyung