Re: [RFC] perf arm-spe: Track task context switch for cpu-mode events
From: Leo Yan
Date: Thu Sep 16 2021 - 09:54:44 EST
Hi Namhyung,
On Wed, Sep 15, 2021 at 05:17:48PM -0700, Namhyung Kim wrote:
> When perf report synthesize events from ARM SPE data, it refers to
> current cpu, pid and tid in the machine. But there's no place to set
> them in the ARM SPE decoder. I'm seeing all pid/tid is set to -1 and
> user symbols are not resolved in the output.
>
> # perf record -a -e arm_spe_0/ts_enable=1/ sleep 1
>
> # perf report -q | head
> 8.77% 8.77% :-1 [kernel.kallsyms] [k] format_decode
> 7.02% 7.02% :-1 [kernel.kallsyms] [k] seq_printf
> 7.02% 7.02% :-1 [unknown] [.] 0x0000ffff9f687c34
> 5.26% 5.26% :-1 [kernel.kallsyms] [k] vsnprintf
> 3.51% 3.51% :-1 [kernel.kallsyms] [k] string
> 3.51% 3.51% :-1 [unknown] [.] 0x0000ffff9f66ae20
> 3.51% 3.51% :-1 [unknown] [.] 0x0000ffff9f670b3c
> 3.51% 3.51% :-1 [unknown] [.] 0x0000ffff9f67c040
> 1.75% 1.75% :-1 [kernel.kallsyms] [k] ___cache_free
> 1.75% 1.75% :-1 [kernel.kallsyms] [k] __count_memcg_events
>
> Like Intel PT, add context switch records to track task info. As ARM
> SPE support was added later than PERF_RECORD_SWITCH_CPU_WIDE, I think
> we can safely set the attr.context_switch bit and use it.
Thanks for the patch.
We discussed enabling PID/TID for SPE samples before; in the patch
set [1], patches 07 and 08 set the sample's pid/tid based on the Arm SPE
context packets. To enable hardware context ID tracing, you also need to
enable the kernel config CONFIG_PID_IN_CONTEXTIDR.
At that time, the concern was that the hardware context ID might
introduce confusion for non-root namespaces.
We also considered using the PERF_RECORD_SWITCH_CPU_WIDE event for setting
pid/tid. The Intel PT implementation uses two things to set the sample's
pid/tid: one is the PERF_RECORD_SWITCH_CPU_WIDE event, and the other is
detecting a branch instruction to the symbol "__switch_to". Since the
PERF_RECORD_SWITCH_CPU_WIDE event is coarse, it only uses the new
pid/tid after the branch instruction for "__switch_to". Arm SPE is
'statistical', so it cannot guarantee that the trace data contains the
branch instruction for "__switch_to"; please see the details in [2].
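To illustrate the point, here is a minimal sketch of the deferred update
described above. It is not the real Intel PT or perf code; struct cpu_ctx,
on_switch_record() and on_branch() are hypothetical names made up for this
example, and the logic is simplified to a single CPU.

```c
#include <stdbool.h>

/* Hypothetical per-CPU tracking state; not the real perf structures. */
struct cpu_ctx {
	int tid;       /* tid currently attributed to samples */
	int next_tid;  /* tid from the last switch record, not yet applied */
	bool pending;  /* true while next_tid waits for "__switch_to" */
};

/* A switch record arrives: remember the incoming tid but do not apply it. */
static void on_switch_record(struct cpu_ctx *c, int next_tid)
{
	c->next_tid = next_tid;
	c->pending = true;
}

/* A decoded branch arrives: apply the pending tid only at "__switch_to". */
static void on_branch(struct cpu_ctx *c, bool to_switch_to)
{
	if (to_switch_to && c->pending) {
		c->tid = c->next_tid;
		c->pending = false;
	}
}
```

With a full trace, on_switch_record() is always followed by on_branch()
with to_switch_to set, so the tid is updated. With statistical sampling
that branch may never show up in the data, and the tid stays stale.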
I think the feasible way is to use CONTEXTIDR to trace PID/TID _only_
for the root namespace, with the perf tool using the context packet to set
pid/tid for samples. So besides patches 07 and 08, we also
need a change in the Arm SPE driver as below:
diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index d44bcc29d99c..2553d53d3772 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -272,7 +272,9 @@ static u64 arm_spe_event_to_pmscr(struct perf_event *event)
 	if (!attr->exclude_kernel)
 		reg |= BIT(SYS_PMSCR_EL1_E1SPE_SHIFT);
 
-	if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR) && perfmon_capable())
+	/* Only enable context ID tracing for root namespace */
+	if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR) && perfmon_capable() &&
+	    (task_active_pid_ns(current) == &init_pid_ns))
 		reg |= BIT(SYS_PMSCR_EL1_CX_SHIFT);
 
 	return reg;
Could you confirm if this works for you? If it's okay for you, I will
sync with James for upstreaming the changes.
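For the tool side, a rough sketch of what "use the context packet to set
pid/tid" could look like is below. This is not the code from patches 07
and 08; enum pkt_type, struct pkt and assign_tids() are hypothetical and
heavily simplified, and the sketch assumes the context ID carries the task
ID as seen in the root namespace.

```c
#include <stddef.h>

/* Hypothetical, simplified packet model; not the real SPE decoder types. */
enum pkt_type { PKT_CONTEXT, PKT_SAMPLE };

struct pkt {
	enum pkt_type type;
	int context_id;  /* only meaningful for PKT_CONTEXT */
};

/*
 * Walk the packets in order and store the tid for each sample in out[].
 * Samples seen before any context packet keep tid -1.
 * Returns the number of samples written (at most out_len).
 */
static size_t assign_tids(const struct pkt *pkts, size_t n,
			  int *out, size_t out_len)
{
	int cur_tid = -1;
	size_t nr = 0;

	for (size_t i = 0; i < n && nr < out_len; i++) {
		if (pkts[i].type == PKT_CONTEXT)
			cur_tid = pkts[i].context_id;
		else
			out[nr++] = cur_tid;
	}
	return nr;
}
```

Without context packets (for example when CX is not enabled), every sample
in this model keeps tid -1, which matches the ":-1" entries in the report
above.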
Thanks,
Leo
[1] https://lore.kernel.org/lkml/20210119144658.793-8-james.clark@xxxxxxx/
[2] https://lore.kernel.org/lkml/20210204102734.GA4737@leoy-ThinkPad-X240s/