Re: [RFC PATCH 1/2] perf: arm_spe: Fix consistency of PMSCR register bit CX

From: German Gomez
Date: Tue Feb 15 2022 - 09:30:28 EST



On 11/02/2022 10:45, Leo Yan wrote:
> Hi German,
>
> On Thu, Feb 10, 2022 at 05:23:50PM +0000, German Gomez wrote:
>
> [...]
>
>>>>>> One way to fix this is by caching the value of the CX bit during the
>>>>>> initialization of the PMU event, so that it remains consistent for the
>>>>>> duration of the session.
>>>>>>
>>>>>> [...]
>>> So the patch makes sense to me. Just a minor comment:
>>>
>>> Here we can define a u64 for recording pmscr value rather than a
>>> bool value.
>>>
>>> struct arm_spe_pmu {
>>> ...
>>> u64 pmscr;
>>> };
>> I agree with the comment from Will that it makes more sense to store the
>> value of the register in the perf_event somehow (due to misunderstanding
>> from my side, I thought arm_spe_pmu struct was local to the session).
> It's a shame that I missed this point :) As you said, struct arm_spe_pmu is
> the data structure for an Arm SPE device driver instance; it's not
> allocated per perf session.
>
>> What about perf_event's void *pmu_private?
> Before we use perf_event::pmu_private, could you check the data
> structure arm_spe_pmu_buf first? This data structure is allocated
> when the AUX ring buffer is set up (so it's allocated per perf session).
> IIUC, the function arm_spe_pmu_setup_aux() will be invoked in the perf
> process, so it's a good place for us to initialize pmscr.
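Thanks for the suggestion. I recorded the following stack trace with a
kprobe on arm_spe_pmu_setup_aux(). It confirms that the function runs in
the context of the perf process, reached through the mmap syscall: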

 perf-323841 [052] d.... 3996.528812: arm_spe_pmu_setup_aux: (arm_spe_pmu_setup_aux+0x60/0x1c0 [arm_spe_pmu])
 perf-323841 [052] d.... 3996.528813: <stack trace>
 => kprobe_dispatcher
 => kprobe_breakpoint_handler
 => call_break_hook
 => brk_handler
 => do_debug_exception
 => el1_dbg
 => el1h_64_sync_handler
 => el1h_64_sync
 => arm_spe_pmu_setup_aux
 => perf_mmap
 => mmap_region
 => do_mmap
 => vm_mmap_pgoff
 => ksys_mmap_pgoff
 => __arm64_sys_mmap
 => invoke_syscall
 => el0_svc_common.constprop.0
 => do_el0_svc
 => el0_svc
 => el0t_64_sync_handler
 => el0t_64_sync

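So for v2 I may include something like this, which caches the PMSCR value
in arm_spe_pmu_buf at AUX setup time and reuses it in arm_spe_pmu_start():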

diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index d44bcc29d..aadec5a0e 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -45,6 +45,7 @@ struct arm_spe_pmu_buf {
     int                    nr_pages;
     bool                    snapshot;
     void                    *base;
+    u64                    pmscr;
 };
 
 struct arm_spe_pmu {
@@ -748,7 +749,7 @@ static void arm_spe_pmu_start(struct perf_event *event, int flags)
         write_sysreg_s(reg, SYS_PMSICR_EL1);
     }
 
-    reg = arm_spe_event_to_pmscr(event);
+    reg = ((struct arm_spe_pmu_buf *) perf_get_aux(handle))->pmscr;
     isb();
     write_sysreg_s(reg, SYS_PMSCR_EL1);
 }
@@ -855,6 +856,8 @@ static void *arm_spe_pmu_setup_aux(struct perf_event *event, void **pages,
     if (!pglist)
         goto out_free_buf;
 
+    buf->pmscr = arm_spe_event_to_pmscr(event);
+
     for (i = 0; i < nr_pages; ++i)
         pglist[i] = virt_to_page(pages[i]);
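
To illustrate the intent outside the kernel, here is a minimal userspace
sketch of the same caching pattern. It is not driver code: every mock_*
name and the MOCK_PMSCR_CX value are hypothetical stand-ins, and it
assumes that the computed register value depends on the privileges of
whichever task is current when it is evaluated.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the CX bit of PMSCR; illustrative only. */
#define MOCK_PMSCR_CX (1ULL << 3)

/* Hypothetical stand-in for the privileges of the task that is running
 * at the moment the register value is computed. */
static bool mock_current_is_privileged;

/* Mock of the per-event configuration (assumption, not perf_event). */
struct mock_event {
	bool wants_context;
};

/* Mock of the per-session buffer, mirroring the new pmscr field. */
struct mock_buf {
	uint64_t pmscr;
};

/* Computes the register value; the result depends on the current task. */
static uint64_t mock_event_to_pmscr(const struct mock_event *ev)
{
	uint64_t reg = 0;

	if (ev->wants_context && mock_current_is_privileged)
		reg |= MOCK_PMSCR_CX;
	return reg;
}

/* Runs once per session, in the context of the process that set it up. */
static void mock_setup_aux(struct mock_buf *buf, const struct mock_event *ev)
{
	buf->pmscr = mock_event_to_pmscr(ev);
}

/* May run in any context; reuses the cached value instead of recomputing. */
static uint64_t mock_start(const struct mock_buf *buf)
{
	return buf->pmscr;
}
```

With this shape, mock_start() keeps returning the value computed in
mock_setup_aux() even if mock_current_is_privileged changes afterwards,
whereas calling mock_event_to_pmscr() again at that point could yield a
different result.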

>
> Thanks,
> Leo