Re: [RESEND PATCH 01/12] perf/x86/intel: Support the PEBS event mask

From: Liang, Kan
Date: Fri Jun 21 2024 - 10:19:45 EST




On 2024-06-20 11:58 a.m., Liang, Kan wrote:
>
>
> On 2024-06-20 3:02 a.m., Peter Zijlstra wrote:
>> On Tue, Jun 18, 2024 at 08:10:33AM -0700, kan.liang@xxxxxxxxxxxxxxx wrote:
>>> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
>>> index e010bfed8417..a0104c82baed 100644
>>> --- a/arch/x86/events/intel/ds.c
>>> +++ b/arch/x86/events/intel/ds.c
>>
>>> @@ -2157,6 +2157,7 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d
>>> void *base, *at, *top;
>>> short counts[INTEL_PMC_IDX_FIXED + MAX_FIXED_PEBS_EVENTS] = {};
>>> short error[INTEL_PMC_IDX_FIXED + MAX_FIXED_PEBS_EVENTS] = {};
>>> + int max_pebs_events = hweight64(x86_pmu.pebs_events_mask);
>>
>> Consider pebs_events_mask = 0xf0, then max_pebs_events becomes 4.
>
> The intel_pmu_drain_pebs_nhm() is a NHM specific function. It's
> impossible that there is a pebs_events_mask = 0xf0.
>
> There are only 4 counters. The mask should always be 0xf.
>>
>>> int bit, i, size;
>>> u64 mask;
>>>
>>> @@ -2168,8 +2169,8 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d
>>>
>>> ds->pebs_index = ds->pebs_buffer_base;
>>>
>>> - mask = (1ULL << x86_pmu.max_pebs_events) - 1;
>>> - size = x86_pmu.max_pebs_events;
>>> + mask = x86_pmu.pebs_events_mask;
>>> + size = max_pebs_events;
>>
>> This is wrong.. you have 8 bits to iterate, of which only the top 4 are
>> set.
>>
>>> if (x86_pmu.flags & PMU_FL_PEBS_ALL) {
>>> mask |= ((1ULL << x86_pmu.num_counters_fixed) - 1) << INTEL_PMC_IDX_FIXED;
>>> size = INTEL_PMC_IDX_FIXED + x86_pmu.num_counters_fixed;
>>> @@ -2208,8 +2209,8 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d
>>> pebs_status = p->status = cpuc->pebs_enabled;
>>>
>>> bit = find_first_bit((unsigned long *)&pebs_status,
>>> - x86_pmu.max_pebs_events);
>>> - if (bit >= x86_pmu.max_pebs_events)
>>> + max_pebs_events);
>>> + if (bit >= max_pebs_events)
>>> continue;
>>
>> But the bit check here then truncates the thing to the lower 4 bits
>> which are all 0.
>>
>> Should this not be something like:
>>
>> if (!(pebs_event_mask & (1 << bit)))
>> continue;
>>
>> ?
>>
>
> For the existing hardware, I don't think it's necessary. The counters
> are contiguous.
>

Think about it twice. Although either code works, we should try to make
the code as generic as possible.

I will introduce a generic x86_pmu_max_num_pebs() and check the highest
set bit, rather than hweight64. It can be used by both NHM and the
future platforms.


Thanks,
Kan