Re: [PATCH 2/2] soc: qcom: add l2 cache perf events driver

From: Neil Leeder
Date: Fri Jun 10 2016 - 18:34:15 EST




On 6/9/2016 03:41 PM, Peter Zijlstra wrote:
> On Thu, Jun 09, 2016 at 04:56:16PM +0100, Mark Rutland wrote:
>>>>> +static irqreturn_t l2_cache__handle_irq(int irq_num, void *data)
>>>>> +{
>>>>> + struct hml2_pmu *slice = data;
>>>>> + u32 ovsr;
>>>>> + int idx;
>>>>> + struct pt_regs *regs;
>>>>> +
>>>>> + ovsr = hml2_pmu__getreset_ovsr();
>>>>> + if (!hml2_pmu__has_overflowed(ovsr))
>>>>> + return IRQ_NONE;
>>>>> +
>>>>> + regs = get_irq_regs();
>>>>> +
>>>>> + for (idx = 0; idx < l2cache_pmu.num_counters; idx++) {
>>>>> + struct perf_event *event = slice->events[idx];
>>>>> + struct hw_perf_event *hwc;
>>>>> + struct perf_sample_data data;
>>>>> +
>>>>> + if (!event)
>>>>> + continue;
>>>>> +
>>>>> + if (!hml2_pmu__counter_has_overflowed(ovsr, idx))
>>>>> + continue;
>>>>> +
>>>>> + l2_cache__event_update_from_slice(event, slice);
>>>>> + hwc = &event->hw;
>>>>> +
>>>>> + if (is_sampling_event(event)) {
>>>>> + perf_sample_data_init(&data, 0, hwc->last_period);
>>>>
>>>> I don't think sampling makes sense, given this is an uncore PMU and the
>>>> events are triggered by other CPUs.
>>>
>>> There is origin filtering so events can be attributed to a CPU when sampling.
>>
>> Ok. I believe that's different from all other uncore PMUs we support
>> (none of the drivers support sampling, certainly), so I'm not entirely
>> sure how/if we can make use of that sanely and reliably.
>
> Right; because not only do you need to know which CPU originated the
> event, the IRQ must also happen on that CPU. Simply knowing which CPU
> triggered it is not enough for sampling.
>
>> For the timebeing, I think this sampling needs to go, and the event_init
>> logic needs to reject sampling as with other uncore PMU drivers.
>
> Agreed.

I want to make sure I understand what the concern is here.
Given the hardware filter which restricts counting to events generated by
a specific CPU, and an irq which is affine to that CPU, sampling and task mode
would seem to work for a single perf use.
Is the issue only related to multiple concurrent perf uses?

>
>> One thing I forgot to mention in my earlier comments is that as an
>> uncore PMU you need to have task_ctx_nr = perf_invalid_context here
>> also.
>
> For good reasons, uncore PMUs (as is the case here) count strictly more
> than the effect of single CPUs (and thus also the current task). So
> attributing it back to a task is nonsense.
>

Thanks,
Neil

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.