Re: [PATCH V16 7/7] perf intel-pt: Add a test for pause / resume

From: Ian Rogers
Date: Mon Dec 23 2024 - 15:26:28 EST


On Sun, Dec 22, 2024 at 11:17 AM Adrian Hunter <adrian.hunter@xxxxxxxxx> wrote:
>
> On 22/12/24 18:54, Ian Rogers wrote:
> > On Sat, Dec 21, 2024 at 11:30 AM Adrian Hunter <adrian.hunter@xxxxxxxxx> wrote:
> >>
> >> On 21/12/24 19:10, Ian Rogers wrote:
> >>> On Sun, Dec 15, 2024 at 11:03 PM Adrian Hunter <adrian.hunter@xxxxxxxxx> wrote:
> >>>>
> >>>> Add a simple sub-test to the "Miscellaneous Intel PT testing" test to
> >>>> check pause / resume.
> >>>>
> >>>> Signed-off-by: Adrian Hunter <adrian.hunter@xxxxxxxxx>
> >>>> Acked-by: Ian Rogers <irogers@xxxxxxxxxx>
> >>>> Reviewed-by: Andi Kleen <ak@xxxxxxxxxxxxxxx>
> >>>> ---
> >>>> tools/perf/tests/shell/test_intel_pt.sh | 28 +++++++++++++++++++++++++
> >>>> 1 file changed, 28 insertions(+)
> >>>>
> >>>> diff --git a/tools/perf/tests/shell/test_intel_pt.sh b/tools/perf/tests/shell/test_intel_pt.sh
> >>>> index e6f0070975f6..f3a9a040bacc 100755
> >>>> --- a/tools/perf/tests/shell/test_intel_pt.sh
> >>>> +++ b/tools/perf/tests/shell/test_intel_pt.sh
> >>>> @@ -644,6 +644,33 @@ test_pipe()
> >>>> return 0
> >>>> }
> >>>>
> >>>> +test_pause_resume()
> >>>> +{
> >>>> + echo "--- Test with pause / resume ---"
> >>>> + if ! perf_record_no_decode -o "${perfdatafile}" -e intel_pt/aux-action=start-paused/u uname ; then
> >>>> + echo "SKIP: pause / resume is not supported"
> >>>> + return 2
> >>>> + fi
> >>>> + if ! perf_record_no_bpf -o "${perfdatafile}" \
> >>>> + -e intel_pt/aux-action=start-paused/u \
> >>>> + -e instructions/period=50000,aux-action=resume,name=Resume/u \
> >>>> + -e instructions/period=100000,aux-action=pause,name=Pause/u uname ; then
> >>>> + echo "perf record with pause / resume failed"
> >>>> + return 1
> >>>> + fi
> >>>> + if ! perf script -i "${perfdatafile}" --itrace=b -Fperiod,event | \
> >>>> + awk 'BEGIN {paused=1;branches=0}
> >>>> + /Resume/ {paused=0}
> >>>> + /branches/ {if (paused) exit 1;branches=1}
> >>>> + /Pause/ {paused=1}
> >>>> + END {if (!branches) exit 1}' ; then
> >>>> + echo "perf record with pause / resume failed"
> >>>> + return 1
> >>>> + fi
> >>>> + echo OK
> >>>
> >>> Hi,
> >>>
> >>> this new test is now constantly making "Miscellaneous Intel PT testing" fail:
> >>>
> >>> ```
> >>> ...
> >>> --- Test with pause / resume ---
> >>> Error:
> >>> Failure to open event 'intel_pt/aux-action=start-paused/u' on PMU
> >>> 'intel_pt' which will be removed.
> >>> The 'aux_action' feature is not supported, update the kernel.
> >>> Linux
> >>> [ perf record: Woken up 1 times to write data ]
> >>> [ perf record: Captured and wrote 0.003 MB
> >>> /tmp/perf-test-intel-pt-sh.Hs8jcq0ADc/test-perf.data ]
> >>> Error:
> >>> Failure to open event 'intel_pt/aux-action=start-paused/u' on PMU
> >>> 'intel_pt' which will be removed.
> >>> The 'aux_action' feature is not supported, update the kernel.
> >>> Error:
> >>> Failure to open event 'Resume' on PMU 'cpu' which will be removed.
> >>> The 'aux_action' feature is not supported, update the kernel.
> >>> Error:
> >>> Failure to open event 'Pause' on PMU 'cpu' which will be removed.
> >>> The 'aux_action' feature is not supported, update the kernel.
> >>> Linux
> >>> [ perf record: Woken up 1 times to write data ]
> >>> [ perf record: Captured and wrote 0.005 MB
> >>> /tmp/perf-test-intel-pt-sh.Hs8jcq0ADc/test-perf.data ]
> >>> perf record with pause / resume failed
> >>> --- Cleaning up ---
> >>> ...
> >>> ```
> >>>
> >>> Should the fail be turned into a skip for missing kernel support?
> >>
> >> Seems to skip for me with perf-tools-next:
> >>
> >> --- Test with pause / resume ---
> >> Error:
> >> The 'aux_action' feature is not supported, update the kernel.
> >> SKIP: pause / resume is not supported
> >>
> >> perf version 6.13.rc2.g39c2547579aa
> >
> > My mistake, I thought I tested a clean client but I was testing with:
> > https://lore.kernel.org/lkml/20241221192654.94344-1-irogers@xxxxxxxxxx/
> > The issue there is that intel-pt is opening multiple dummy events in this test:
>
> >
> > That is a dummy event is added in 2 places:
> > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/arch/x86/util/intel-pt.c?h=perf-tools-next#n800
> > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/arch/x86/util/intel-pt.c?h=perf-tools-next#n865
> >
> > I'll need to update my patch set for things outside of record adding
> > dummy events, but I'm curious if the intel-pt code could share a
> > single dummy evsel?
>
> I doubt it. There are cases where capturing auxiliary events
> like text_poke or context_switch want to be system-wide while
> the workload task events (comm, task, mmap2) are not.

Thanks, I'll rework the dummy event detection because of this. I'm a
little concerned about APIs like `evlist__findnew_tracking_event`, nit
that we're inconsistent on naming (dummy vs tracking vs sideband),
perhaps we need an enum to capture the different
dummy/tracking/sideband event kinds for APIs like this, given the
expectation that there can be >1. Something I hope to do for uncore
evsels is to move the multiple PMUs into the evsel itself. My hope is
that will help with `perf stat` aggregation logic and the like, as
well as simplifying event sorting. Perhaps the evsel for a
dummy/tracking/sideband event can cover >1 use-case to reduce the
evlist's evsels.

Thanks,
Ian