Re: [PATCH v2 09/13] perf: cs-etm: Update record event to use new Trace ID protocol
From: Mike Leach
Date: Tue Aug 09 2022 - 12:13:33 EST
Hi James
On Wed, 20 Jul 2022 at 15:41, James Clark <james.clark@xxxxxxx> wrote:
>
>
>
> On 04/07/2022 09:11, Mike Leach wrote:
> > Trace IDs are now dynamically allocated.
> >
> > The static 'cpu * 2 + seed' association algorithm used previously is
> > no longer used: it did not scale and was broken on systems with high
> > core counts (>46).
> >
> > The Trace ID is now marked as unused in the AUXINFO record, and the
> > ID / CPU association is sent in the PERF_RECORD_AUX_OUTPUT_HW_ID record.
> >
> > Remove legacy Trace ID allocation algorithm.
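
As a rough illustration of why the static mapping does not scale, the sketch below reproduces the removed 'cpu * 2 + seed' formula and checks the result against the 7-bit trace ID field mentioned in the removed comment. The helper names legacy_trace_id() and trace_id_fits() and the CS_TRACE_ID_BITS macro are hypothetical and not part of the patch. The check only covers the field width; the architecture also reserves values near the top of the range (0x7F, per the new comment in the patch), so the real limit is lower than this sketch suggests.

```c
#include <stdbool.h>

#define CORESIGHT_ETM_PMU_SEED	0x10	/* seed value removed by this patch */
#define CS_TRACE_ID_BITS	7	/* assumption: "fits in 7 bits", per the removed comment */

/* Legacy static mapping, same arithmetic as the removed coresight_get_trace_id(). */
static inline int legacy_trace_id(int cpu)
{
	return CORESIGHT_ETM_PMU_SEED + (cpu * 2);
}

/*
 * Hypothetical helper: true if the ID is non-zero and can be encoded in
 * the 7-bit trace ID field. Reserved values inside that range are NOT
 * checked here.
 */
static inline bool trace_id_fits(int id)
{
	return id > 0 && id < (1 << CS_TRACE_ID_BITS);
}
```

With these definitions, CPU 0 maps to 0x10 and CPU 1 to 0x12, while a CPU number such as 64 yields 144, which no longer fits in the 7-bit field.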
> >
> > Signed-off-by: Mike Leach <mike.leach@xxxxxxxxxx>
> > ---
> > include/linux/coresight-pmu.h | 19 +++++++------------
> > tools/include/linux/coresight-pmu.h | 19 +++++++------------
>
> I usually see mentions that these header updates need to be separate commits
> because they are merged through different trees.
>
> > tools/perf/arch/arm/util/cs-etm.c | 21 ++++++++++++---------
> > 3 files changed, 26 insertions(+), 33 deletions(-)
> >
> > diff --git a/include/linux/coresight-pmu.h b/include/linux/coresight-pmu.h
> > index 4ac5c081af93..9f7ee380266b 100644
> > --- a/include/linux/coresight-pmu.h
> > +++ b/include/linux/coresight-pmu.h
> > @@ -8,7 +8,13 @@
> > #define _LINUX_CORESIGHT_PMU_H
> >
> > #define CORESIGHT_ETM_PMU_NAME "cs_etm"
> > -#define CORESIGHT_ETM_PMU_SEED 0x10
> > +
> > +/*
> > + * Metadata now contains an unused trace ID - IDs are transmitted using a
> > + * PERF_RECORD_AUX_OUTPUT_HW_ID record.
> > + * Value architecturally defined as reserved in CoreSight.
> > + */
> > +#define CS_UNUSED_TRACE_ID 0x7F
> >
> > /*
> > * Below are the definition of bit offsets for perf option, and works as
> > @@ -32,15 +38,4 @@
> > #define ETM4_CFG_BIT_RETSTK 12
> > #define ETM4_CFG_BIT_VMID_OPT 15
> >
> > -static inline int coresight_get_trace_id(int cpu)
> > -{
> > - /*
> > - * A trace ID of value 0 is invalid, so let's start at some
> > - * random value that fits in 7 bits and go from there. Since
> > - * the common convention is to have data trace IDs be I(N) + 1,
> > - * set instruction trace IDs as a function of the CPU number.
> > - */
> > - return (CORESIGHT_ETM_PMU_SEED + (cpu * 2));
> > -}
> > -
> > #endif
> > diff --git a/tools/include/linux/coresight-pmu.h b/tools/include/linux/coresight-pmu.h
> > index 6c2fd6cc5a98..31d007fab3a6 100644
> > --- a/tools/include/linux/coresight-pmu.h
> > +++ b/tools/include/linux/coresight-pmu.h
> > @@ -8,7 +8,13 @@
> > #define _LINUX_CORESIGHT_PMU_H
> >
> > #define CORESIGHT_ETM_PMU_NAME "cs_etm"
> > -#define CORESIGHT_ETM_PMU_SEED 0x10
> > +
> > +/*
> > + * Metadata now contains an unused trace ID - IDs are transmitted using a
> > + * PERF_RECORD_AUX_OUTPUT_HW_ID record.
> > + * Value architecturally defined as reserved in CoreSight.
> > + */
> > +#define CS_UNUSED_TRACE_ID 0x7F
> >
>
> minor nit: this isn't used in the kernel so only needs to be defined on the
> tools side.
>
Unfortunately, if the two copies of coresight-pmu.h differ, the perf
build emits a warning, so they have to be kept identical.
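
To make the "identical" requirement concrete, here is a minimal sketch of a byte-for-byte comparison of two header copies. It does not reproduce the check that the perf build performs; files_identical() is a hypothetical stand-alone helper and the paths passed to it are up to the caller.

```c
#include <stdio.h>

/*
 * Hypothetical helper: compare two files byte by byte.
 * Returns 1 if the contents are identical, 0 if they differ,
 * and -1 if either file cannot be opened.
 */
static int files_identical(const char *path_a, const char *path_b)
{
	FILE *a = fopen(path_a, "rb");
	FILE *b = fopen(path_b, "rb");
	int ca, cb, result = 1;

	if (!a || !b) {
		if (a)
			fclose(a);
		if (b)
			fclose(b);
		return -1;
	}

	/* Read both streams in lockstep; stop at the first mismatch or at EOF. */
	do {
		ca = fgetc(a);
		cb = fgetc(b);
		if (ca != cb) {
			result = 0;
			break;
		}
	} while (ca != EOF);

	fclose(a);
	fclose(b);
	return result;
}
```

Called with include/linux/coresight-pmu.h and tools/include/linux/coresight-pmu.h, a return value of 0 would indicate the kind of divergence that triggers the warning.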
Thanks
Mike
> > /*
> > * Below are the definition of bit offsets for perf option, and works as
> > @@ -34,15 +40,4 @@
> > #define ETM4_CFG_BIT_RETSTK 12
> > #define ETM4_CFG_BIT_VMID_OPT 15
> >
> > -static inline int coresight_get_trace_id(int cpu)
> > -{
> > - /*
> > - * A trace ID of value 0 is invalid, so let's start at some
> > - * random value that fits in 7 bits and go from there. Since
> > - * the common convention is to have data trace IDs be I(N) + 1,
> > - * set instruction trace IDs as a function of the CPU number.
> > - */
> > - return (CORESIGHT_ETM_PMU_SEED + (cpu * 2));
> > -}
> > -
> > #endif
> > diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c
> > index 1b54638d53b0..2d68e6a722ed 100644
> > --- a/tools/perf/arch/arm/util/cs-etm.c
> > +++ b/tools/perf/arch/arm/util/cs-etm.c
> > @@ -421,13 +421,16 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
> > evlist__to_front(evlist, cs_etm_evsel);
> >
> > /*
> > - * In the case of per-cpu mmaps, we need the CPU on the
> > - * AUX event. We also need the contextID in order to be notified
> > + * get the CPU on the sample - need it to associate trace ID in the
> > + * AUX_OUTPUT_HW_ID event, and the AUX event for per-cpu mmaps.
> > + */
> > + evsel__set_sample_bit(cs_etm_evsel, CPU);
> > +
> > + /*
> > + * Also, in the case of per-cpu mmaps, we need the contextID in order
> > + * to be notified when a context switch happened.
> > */
> > if (!perf_cpu_map__empty(cpus)) {
> > - evsel__set_sample_bit(cs_etm_evsel, CPU);
> > -
> > err = cs_etm_set_option(itr, cs_etm_evsel,
> > BIT(ETM_OPT_CTXTID) | BIT(ETM_OPT_TS));
> > if (err)
> > @@ -633,8 +636,9 @@ static void cs_etm_save_etmv4_header(__u64 data[], struct auxtrace_record *itr,
> >
> > /* Get trace configuration register */
> > data[CS_ETMV4_TRCCONFIGR] = cs_etmv4_get_config(itr);
> > - /* Get traceID from the framework */
> > - data[CS_ETMV4_TRCTRACEIDR] = coresight_get_trace_id(cpu);
> > + /* traceID set to unused */
> > + data[CS_ETMV4_TRCTRACEIDR] = CS_UNUSED_TRACE_ID;
> > +
> > /* Get read-only information from sysFS */
> > data[CS_ETMV4_TRCIDR0] = cs_etm_get_ro(cs_etm_pmu, cpu,
> > metadata_etmv4_ro[CS_ETMV4_TRCIDR0]);
> > @@ -681,9 +685,8 @@ static void cs_etm_get_metadata(int cpu, u32 *offset,
> > magic = __perf_cs_etmv3_magic;
> > /* Get configuration register */
> > info->priv[*offset + CS_ETM_ETMCR] = cs_etm_get_config(itr);
> > - /* Get traceID from the framework */
> > - info->priv[*offset + CS_ETM_ETMTRACEIDR] =
> > - coresight_get_trace_id(cpu);
> > + /* traceID set to unused */
> > + info->priv[*offset + CS_ETM_ETMTRACEIDR] = CS_UNUSED_TRACE_ID;
> > /* Get read-only information from sysFS */
> > info->priv[*offset + CS_ETM_ETMCCER] =
> > cs_etm_get_ro(cs_etm_pmu, cpu,
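
For readers following the metadata change, the sketch below shows one way a consumer of this data might treat the new value: a trace ID of CS_UNUSED_TRACE_ID in the metadata means "not known yet", and the real ID is filled in later from a PERF_RECORD_AUX_OUTPUT_HW_ID record. Only CS_UNUSED_TRACE_ID comes from the patch; struct cpu_trace_id and both functions are hypothetical names used for illustration, not the perf decoder's actual code.

```c
#include <stdbool.h>
#include <stdint.h>

#define CS_UNUSED_TRACE_ID 0x7F	/* value defined in the patched coresight-pmu.h */

/* Hypothetical per-CPU bookkeeping for the trace ID association. */
struct cpu_trace_id {
	uint8_t id;
	bool valid;	/* false until a usable ID has been seen */
};

/* Seed the state from the AUXINFO metadata word for this CPU. */
static void init_from_metadata(struct cpu_trace_id *t, uint64_t metadata_id)
{
	t->id = (uint8_t)metadata_id;
	/* New-style files carry the unused marker; old files carry a real ID. */
	t->valid = (metadata_id != CS_UNUSED_TRACE_ID);
}

/* Apply an ID delivered later by a hardware ID record for this CPU. */
static void update_from_hw_id(struct cpu_trace_id *t, uint8_t hw_id)
{
	t->id = hw_id;
	t->valid = true;
}
```

Keeping the metadata path able to accept a real ID is a design choice in this sketch so that files recorded with the legacy scheme would still be handled.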
--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK