Re: [PATCH v6 0/8] perf/amd: Zen4 IBS extensions support (tool changes)
From: Namhyung Kim
Date: Mon Jun 06 2022 - 19:47:15 EST
Hi Ravi,
On Fri, Jun 3, 2022 at 9:46 PM Ravi Bangoria <ravi.bangoria@xxxxxxx> wrote:
>
> The kernel-side changes have already been applied to linus/master
> (except the amd-ibs.h header). This series contains the perf tool changes.
>
> v5: https://lore.kernel.org/lkml/20220601032608.1034-1-ravi.bangoria@xxxxxxx
> v5->v6:
> - Use macros instead of magic numbers for IBS l3missonly bits
> - Use asprintf() instead of allocating memory and copying data manually
> - Add Reviewed-by from Kan Liang (patches 2-5).
>
> Patches prepared on acme/perf/core (9dde6cadb92b5)
>
> Original cover letter:
>
> IBS support has been enhanced with two new features in the upcoming uarch:
> 1. DataSrc extension and 2. L3 Miss Filtering capability. Both are
> indicated by CPUID_Fn8000001B_EAX bit 11.
>
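As a rough illustration of the capability check described above, here is a minimal sketch that tests bit 11 of a CPUID_Fn8000001B_EAX value. The macro and function names are hypothetical and chosen for this example only; they are not taken from the series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration; not taken from the patch series. */
#define IBS_CPUID_LEAF		0x8000001bU	/* CPUID Fn8000_001B: IBS capabilities */
#define IBS_CAPS_ZEN4_BIT	11		/* bit 11 of EAX, per the cover letter */

/*
 * Return true if the given CPUID_Fn8000001B_EAX value has bit 11 set,
 * i.e. the DataSrc extension and L3 Miss Filtering are both indicated.
 */
static bool ibs_has_zen4_ext(uint32_t eax)
{
	return (eax >> IBS_CAPS_ZEN4_BIT) & 1U;
}
```

On x86 with GCC or Clang, the EAX value could be read with __get_cpuid(IBS_CPUID_LEAF, &eax, &ebx, &ecx, &edx) from <cpuid.h>; the helper above only takes the raw value so it can be exercised on any machine.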
> DataSrc extension provides additional data source details for tagged
> load/store operations. Add support for these new bits in perf report/
> script raw-dump.
>
> IBS L3 miss filtering works by tagging an instruction on IBS counter
> overflow and generating an NMI if the tagged instruction causes an L3
> miss. Samples without an L3 miss are discarded and the counter is reset
> with a random value (1-15 for the fetch pmu and 1-127 for the op pmu).
> This helps reduce sampling overhead when the user is interested only
> in such samples. One use case for such filtered samples is to feed
> data to a page-migration daemon in tiered memory systems.
>
> Add support for L3 miss filtering in the IBS driver via a new pmu
> attribute "l3missonly". Example usage:
>
> # perf record -a -e ibs_op/l3missonly=1/ --raw-samples sleep 5
> # perf report -D
>
> Some important points to keep in mind while using L3 miss filtering:
> 1. Hw internally resets the sampling period when the tagged instruction
> does not cause an L3 miss. But there is no way to reconstruct the
> aggregated sampling period when this happens.
> 2. L3 miss is not the actual event being counted. Rather, IBS will
> count fetches, cycles or uOps depending on the configuration. Thus
> the sampling period has no direct connection to L3 misses.
>
> The first point causes sampling period skew, so I've added a warning
> message to perf record:
>
> # perf record -c 10000 -C 0 -e ibs_op/l3missonly=1/
> WARNING: Hw internally resets sampling period when L3 Miss Filtering is enabled
> and tagged operation does not cause L3 Miss. This causes sampling period skew.
>
> Users can configure a smaller sampling period to get more samples
> while using l3missonly.
>
>
> Ravi Bangoria (8):
>   perf record ibs: Warn about sampling period skew
>   perf tool: Parse pmu caps sysfs only once
>   perf headers: Pass "cpu" pmu name while printing caps
>   perf headers: Store pmu caps in an array of strings
>   perf headers: Record non-cpu pmu capabilities
>   perf/x86/ibs: Add new IBS register bits into header
>   perf tool ibs: Sync amd ibs header file
>   perf script ibs: Support new IBS bits in raw trace dump
Acked-by: Namhyung Kim <namhyung@xxxxxxxxxx>
Thanks,
Namhyung
>
> arch/x86/include/asm/amd-ibs.h | 16 +-
> tools/arch/x86/include/asm/amd-ibs.h | 16 +-
> .../Documentation/perf.data-file-format.txt | 10 +-
> tools/perf/arch/x86/util/evsel.c | 52 +++++
> tools/perf/builtin-inject.c | 2 +-
> tools/perf/util/amd-sample-raw.c | 68 ++++++-
> tools/perf/util/env.c | 62 +++++-
> tools/perf/util/env.h | 14 +-
> tools/perf/util/evsel.c | 7 +
> tools/perf/util/evsel.h | 1 +
> tools/perf/util/header.c | 189 ++++++++++--------
> tools/perf/util/header.h | 2 +-
> tools/perf/util/pmu.c | 15 +-
> tools/perf/util/pmu.h | 2 +
> 14 files changed, 329 insertions(+), 127 deletions(-)
>
> --
> 2.31.1
>