Re: [PATCH V6 09/14] perf/x86/intel: Support TopDown metrics on Ice Lake

From: Peter Zijlstra
Date: Tue Jul 21 2020 - 08:40:52 EST


On Fri, Jul 17, 2020 at 07:05:49AM -0700, kan.liang@xxxxxxxxxxxxxxx wrote:

> +static inline u64 icl_get_metrics_event_value(u64 metric, u64 slots, int idx)
> +{
> + u32 val;
> +
> + /*
> + * The metric is reported as an 8bit integer fraction
> + * suming up to 0xff.
> + * slots-in-metric = (Metric / 0xff) * slots
> + */
> + val = (metric >> ((idx - INTEL_PMC_IDX_METRIC_BASE) * 8)) & 0xff;
> + return mul_u64_u32_div(slots, val, 0xff);
> +}
> +
> +static void __icl_update_topdown_event(struct perf_event *event,
> + u64 slots, u64 metrics)
> +{
> + int idx = event->hw.idx;
> + u64 delta;
> +
> + if (is_metric_idx(idx))
> + delta = icl_get_metrics_event_value(metrics, slots, idx);
> + else
> + delta = slots;
> +
> + local64_add(delta, &event->count);
> +}
> +
> +/*
> + * Update all active Topdown events.
> + *
> + * The PERF_METRICS and Fixed counter 3 are read separately. The values may be
> + * modify by a NMI. PMU has to be disabled before calling this function.
> + */
> +static u64 icl_update_topdown_event(struct perf_event *event)
> +{
> + struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> + struct perf_event *other;
> + u64 slots, metrics;
> + int idx;
> +
> + /* read Fixed counter 3 */
> + rdpmcl((3 | INTEL_PMC_FIXED_RDPMC_BASE), slots);
> + if (!slots)
> + return 0;
> +
> + /* read PERF_METRICS */
> + rdpmcl(INTEL_PMC_FIXED_RDPMC_METRICS, metrics);
> +
> + for_each_set_bit(idx, cpuc->active_mask, INTEL_PMC_IDX_TD_BE_BOUND + 1) {
> + if (!is_topdown_idx(idx))
> + continue;
> + other = cpuc->events[idx];
> + __icl_update_topdown_event(other, slots, metrics);
> + }
> +
> + /*
> + * Check and update this event, which may have been cleared
> + * in active_mask e.g. x86_pmu_stop()
> + */
> + if (event && !test_bit(event->hw.idx, cpuc->active_mask))
> + __icl_update_topdown_event(event, slots, metrics);
> +
> + /*
> + * Software is recommended to periodically clear both registers
> + * in order to maintain accurate measurements, which is required for
> + * certain scenarios that involve sampling metrics at high rates.

I'll maintain that that statement is clear as mud and therefore useless.

> + * Software should always write fixed counter 3 before write to
> + * PERF_METRICS.
> + */
> + wrmsrl(MSR_CORE_PERF_FIXED_CTR3, 0);
> + wrmsrl(MSR_PERF_METRICS, 0);
> +
> + return slots;
> +}

So in the normal case, this ends up calling into this function _5_
times, once for each event. Now the first time, it will actually do
something useful, but the other 4 times it's just wasting cycles.

Also, that for_each_set_bit() loop, trying to find the events to
update...

Can't we, instead, make the SLOTS update advance 5 running counters in
cpuc and feed the events off of that?