On 18/05/18 11:22, Suzuki K Poulose wrote:
Add support for chained event counters. PMUv3 allows chaining
a pair of adjacent PMU counters (where the lower counter number
is always even). The low counter is programmed to count the
event of interest, and the high (odd-numbered) counter is
programmed with a special event code (0x1e - Chain). We
therefore need a special allocation scheme to make full use of
the available counters: chained counters are allocated in pairs
from the lower end, while normal counters are allocated from
the higher end. Also make the changes necessary to handle a
chained event as a single event backed by two counters.
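The two-ended allocation scheme described above can be modelled in a few lines of userspace C (a sketch only, not the kernel code; `NUM_COUNTERS` and the function names are illustrative):

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_COUNTERS 6          /* illustrative; real PMUs have up to 31 */

static bool used[NUM_COUNTERS];

/* Chained events take an even/odd pair, searched from the low end. */
static int alloc_chained(void)
{
    for (int idx = 0; idx + 1 < NUM_COUNTERS; idx += 2) {
        if (!used[idx] && !used[idx + 1]) {
            used[idx] = used[idx + 1] = true;
            return idx + 1;     /* report the odd (high) counter */
        }
    }
    return -1;
}

/* Single events are allocated from the high end, keeping pairs free. */
static int alloc_single(void)
{
    for (int idx = NUM_COUNTERS - 1; idx >= 0; idx--) {
        if (!used[idx]) {
            used[idx] = true;
            return idx;
        }
    }
    return -1;
}
```

Allocating singles from the top and pairs from the bottom minimises fragmentation: a lone single event never splits an even/odd pair unless no other counter is free.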
Cc: Mark Rutland <mark.rutland@xxxxxxx>
Cc: Will Deacon <will.deacon@xxxxxxx>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@xxxxxxx>
---
 arch/arm64/kernel/perf_event.c | 226 ++++++++++++++++++++++++++++++++++++-----
 1 file changed, 202 insertions(+), 24 deletions(-)
diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index ea8e060..5f81cd0 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -446,9 +446,11 @@ static struct attribute_group armv8_pmuv3_events_attr_group = {
+static inline u64 armv8pmu_read_chain_counter(int idx)
+{
+	u64 prev_hi, hi, lo;
+
+	do {
+		prev_hi = armv8pmu_read_evcntr(idx);
+		isb();
+		lo = armv8pmu_read_evcntr(idx - 1);
+		isb();
+		hi = armv8pmu_read_evcntr(idx);
+		isb();
+	} while (prev_hi != hi);
Is it worth trying to elide that last isb() in the highly likely case that we don't need it?
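For reference, the loop above is the classic hi/lo/hi double-read for a 64-bit value split across two 32-bit registers: retry until the high half is stable across the low read. A single-threaded userspace model of the same pattern (mock register reads that advance the counter between accesses, so no barriers are needed):

```c
#include <assert.h>
#include <stdint.h>

/* Mock 64-bit counter that "ticks" on every half-read, modelling a
 * counter that keeps advancing while we read it piecewise. */
static uint64_t counter = 0x00000000ffffffffULL;

static uint32_t read_half(int high)
{
    uint32_t v = high ? (uint32_t)(counter >> 32) : (uint32_t)counter;
    counter++;                  /* the counter advances between reads */
    return v;
}

/* hi/lo/hi retry: loop until the high half is unchanged across the
 * low read, guaranteeing the two halves form a consistent snapshot. */
static uint64_t read_chain_counter(void)
{
    uint32_t prev_hi, hi, lo;

    do {
        prev_hi = read_half(1);
        lo = read_half(0);
        hi = read_half(1);
    } while (prev_hi != hi);

    return ((uint64_t)hi << 32) | lo;
}
```

Starting just below a low-half wrap, the first pass sees the high half change and retries; the second pass returns a value whose halves were sampled on the same side of the wrap.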
+static inline void armv8pmu_write_chain_counter(int idx, u64 value)
+{
+	armv8pmu_write_evcntr(idx, value >> 32);
+	isb();
+	armv8pmu_write_evcntr(idx - 1, value);
+	isb();
Either that isb() is unnecessary, or we are (and have been) missing one after a non-chained write.
 static inline int armv8pmu_disable_counter(int idx)
 {
	u32 counter = ARMV8_IDX_TO_COUNTER(idx);
@@ -567,6 +669,24 @@ static inline int armv8pmu_disable_counter(int idx)
	return idx;
 }
+static inline void armv8pmu_disable_event_counter(struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
Nit: might as well drop this and be consistent with the enable case.
 static int armv8pmu_get_event_idx(struct pmu_hw_events *cpuc,
				  struct perf_event *event)
 {
@@ -755,7 +915,10 @@ static int armv8pmu_get_event_idx(struct pmu_hw_events *cpuc,
	struct hw_perf_event *hwc = &event->hw;
	unsigned long evtype = hwc->config_base & ARMV8_PMU_EVTYPE_EVENT;
-	/* Always prefer to place a cycle counter into the cycle counter. */
+	/*
+	 * Always prefer to place a cycle counter into the cycle counter
+	 * irrespective of whether we are counting 32bit/64bit
I don't think that comment change adds much :/
 /*
@@ -845,8 +1016,14 @@ static int __armv8_pmuv3_map_event(struct perf_event *event,
				       &armv8_pmuv3_perf_cache_map,
				       ARMV8_PMU_EVTYPE_EVENT);
-ÂÂÂ if (hw_event_id == ARMV8_PMUV3_PERFCTR_CPU_CYCLES)
+	if (hw_event_id == ARMV8_PMUV3_PERFCTR_CPU_CYCLES) {
+		/* Prevent chaining for cycle counter */
Why? Sure, we want to avoid executing the chaining logic if we're scheduling a cycles event in the dedicated counter (which is perhaps what the comment above wanted to say), but if one ends up allocated into a regular counter (e.g. if the user asks for multiple cycle counts with different filters), then I don't see any reason to forbid that being chained.