Re: [PATCH 11/20] perf/x86/intel: Setup PEBS constraints based on counter & pdist map
From: Mi, Dapeng
Date: Thu Feb 06 2025 - 20:27:31 EST
On 2/6/2025 11:01 PM, Liang, Kan wrote:
>
> On 2025-02-05 9:47 p.m., Mi, Dapeng wrote:
>> On 1/28/2025 12:07 AM, Liang, Kan wrote:
>>> On 2025-01-23 9:07 a.m., Dapeng Mi wrote:
>>>> arch-PEBS provides CPUIDs to enumerate which counters support PEBS
>>>> sampling and precise distribution PEBS sampling. Thus PEBS constraints
>>>> can be dynamically configured based on these counter and precise
>>>> distribution bitmaps instead of being defined statically.
>>>>
>>>> Signed-off-by: Dapeng Mi <dapeng1.mi@xxxxxxxxxxxxxxx>
>>>> ---
>>>> arch/x86/events/intel/core.c | 20 ++++++++++++++++++++
>>>> arch/x86/events/intel/ds.c | 1 +
>>>> 2 files changed, 21 insertions(+)
>>>>
>>>> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
>>>> index 7775e1e1c1e9..0f1be36113fa 100644
>>>> --- a/arch/x86/events/intel/core.c
>>>> +++ b/arch/x86/events/intel/core.c
>>>> @@ -3728,6 +3728,7 @@ intel_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
>>>> struct perf_event *event)
>>>> {
>>>> struct event_constraint *c1, *c2;
>>>> + struct pmu *pmu = event->pmu;
>>>>
>>>> c1 = cpuc->event_constraint[idx];
>>>>
>>>> @@ -3754,6 +3755,25 @@ intel_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
>>>> c2->weight = hweight64(c2->idxmsk64);
>>>> }
>>>>
>>>> + if (x86_pmu.arch_pebs && event->attr.precise_ip) {
>>>> + u64 pebs_cntrs_mask;
>>>> + u64 cntrs_mask;
>>>> +
>>>> + if (event->attr.precise_ip >= 3)
>>>> + pebs_cntrs_mask = hybrid(pmu, arch_pebs_cap).pdists;
>>>> + else
>>>> + pebs_cntrs_mask = hybrid(pmu, arch_pebs_cap).counters;
>>>> +
>>>> + cntrs_mask = hybrid(pmu, fixed_cntr_mask64) << INTEL_PMC_IDX_FIXED |
>>>> + hybrid(pmu, cntr_mask64);
>>>> +
>>>> + if (pebs_cntrs_mask != cntrs_mask) {
>>>> + c2 = dyn_constraint(cpuc, c2, idx);
>>>> + c2->idxmsk64 &= pebs_cntrs_mask;
>>>> + c2->weight = hweight64(c2->idxmsk64);
>>>> + }
>>>> + }
>>> The pebs_cntrs_mask and cntrs_mask don't change after the machine
>>> boots. I don't think it's efficient to calculate them every time.
>>>
>>> Maybe add a local pebs_event_constraints_pdist[] and update both
>>> pebs_event_constraints[] and pebs_event_constraints_pdist[] with the
>>> enumerated mask at initialization time.
>>>
>>> Update the intel_pebs_constraints() to utilize the corresponding array
>>> according to the precise_ip.
>>>
>>> That way the dynamic calculation above may be avoided.
>> Even if we have these two arrays, we still need the dynamic constraint,
>> right? We can't predict what the event is; the event may be mapped to a
>> quite specific event constraint, and we can't know it in advance.
> The dynamic constraint is not necessary, but two arrays don't seem to be
> enough, because a PEBS event may fall back to the event_constraints as
> well. Sigh.
> Four arrays should be required. pebs_event_constraints[],
> pebs_event_constraints_pdist[], event_constraints_for_pebs[],
> event_constraints_for_pdist_pebs[].
> But it seems too complicated. It may be better not to implement it now.
>
> But, at least, pebs_cntrs_mask and cntrs_mask can be calculated once in
> hw_config(), or even in intel_pmu_init(). They should not be
> calculated every time in the critical path.
Yeah, there is no need to calculate these two counter masks at each call.
It looks like we can further optimize this based on your dynamic
constraints optimization patch. Thanks.
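
For illustration only, the precomputation could look roughly like the
sketch below. This is a standalone user-space model, not kernel code: the
struct layout and field names (pmu_caps, pebs_mask_differs, etc.) are made
up for the example, and only the mask arithmetic mirrors the patch.

```c
#include <stdint.h>

/* Simplified stand-ins for the kernel fields (hypothetical names). */
#define INTEL_PMC_IDX_FIXED 32

struct arch_pebs_cap {
	uint64_t counters;	/* counters supporting PEBS */
	uint64_t pdists;	/* counters supporting precise-dist PEBS */
};

struct pmu_caps {
	struct arch_pebs_cap pebs;
	uint64_t cntr_mask64;		/* general-purpose counter bitmap */
	uint64_t fixed_cntr_mask64;	/* fixed counter bitmap */
	/* Precomputed once at init instead of per constraint lookup: */
	uint64_t cntrs_mask;
	int pebs_mask_differs;		/* PEBS counters != all counters */
	int pdist_mask_differs;		/* pdist counters != all counters */
};

/* Done once at "init" time (analogous to intel_pmu_init()). */
static void precompute_pebs_masks(struct pmu_caps *p)
{
	p->cntrs_mask = (p->fixed_cntr_mask64 << INTEL_PMC_IDX_FIXED) |
			p->cntr_mask64;
	p->pebs_mask_differs  = p->pebs.counters != p->cntrs_mask;
	p->pdist_mask_differs = p->pebs.pdists   != p->cntrs_mask;
}

/* Hot path: pick the constraint mask with no recomputation. */
static uint64_t pebs_constraint_mask(const struct pmu_caps *p,
				     int precise_ip, int *needs_dyn)
{
	*needs_dyn = precise_ip >= 3 ? p->pdist_mask_differs
				     : p->pebs_mask_differs;
	return precise_ip >= 3 ? p->pebs.pdists : p->pebs.counters;
}
```

The hot-path helper then only reads precomputed fields, which is the point
of moving the mask math out of intel_get_event_constraints().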
>
> Thanks,
> Kan
>>
>>> Thanks,
>>> Kan
>>>
>>>> +
>>>> return c2;
>>>> }
>>>>
>>>> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
>>>> index 2f2c6b7c801b..a573ce0e576a 100644
>>>> --- a/arch/x86/events/intel/ds.c
>>>> +++ b/arch/x86/events/intel/ds.c
>>>> @@ -2941,6 +2941,7 @@ static void __init intel_arch_pebs_init(void)
>>>> x86_pmu.pebs_buffer_size = PEBS_BUFFER_SIZE;
>>>> x86_pmu.drain_pebs = intel_pmu_drain_arch_pebs;
>>>> x86_pmu.pebs_capable = ~0ULL;
>>>> + x86_pmu.flags |= PMU_FL_PEBS_ALL;
>>>> }
>>>>
>>>> /*
>