Re: [PATCH 2/2] perf/x86/amd: Add support for Large Increment per Cycle Events

From: Peter Zijlstra
Date: Fri Dec 20 2019 - 07:10:02 EST


On Thu, Nov 14, 2019 at 12:37:20PM -0600, Kim Phillips wrote:

I still hate the naming on this, "large increment per cycle" is just a
random bunch of words collected by AMD marketing or somesuch. It doesn't
convey the fundamental point that counters get paired. So I've done a
giant bunch of search and replace on it for you.

> @@ -621,6 +622,8 @@ void x86_pmu_disable_all(void)
> continue;
> val &= ~ARCH_PERFMON_EVENTSEL_ENABLE;
> wrmsrl(x86_pmu_config_addr(idx), val);
> + if (is_large_inc(hwc))
> + wrmsrl(x86_pmu_config_addr(idx + 1), 0);
> }
> }
>

See, the above makes sense: it doesn't assume anything about where the
config for idx and the config for idx+1 live. And then here:

> @@ -855,6 +871,9 @@ static inline void x86_pmu_disable_event(struct perf_event *event)
> struct hw_perf_event *hwc = &event->hw;
>
> wrmsrl(hwc->config_base, hwc->config);
> +
> + if (is_large_inc(hwc))
> + wrmsrl(hwc->config_base + 2, 0);
> }

You hard-code the offset as being 2. I fixed that for you.

> @@ -849,14 +862,19 @@ int perf_assign_events(struct event_constraint **constraints, int n,
> int wmin, int wmax, int gpmax, int *assign)
> {
> struct perf_sched sched;
> + struct event_constraint *c;
>
> perf_sched_init(&sched, constraints, n, wmin, wmax, gpmax);
>
> do {
> if (!perf_sched_find_counter(&sched))
> break; /* failed */
> - if (assign)
> + if (assign) {
> assign[sched.state.event] = sched.state.counter;
> + c = constraints[sched.state.event];
> + if (c->flags & PERF_X86_EVENT_LARGE_INC)
> + sched.state.counter++;
> + }
> } while (perf_sched_next_event(&sched));
>
> return sched.state.unassigned;

I'm still confused by this bit. AFAICT it serves no purpose.
perf_sched_next_event() will reset sched->state.counter to 0 on every
pass anyway.

I've deleted it for you.

> @@ -926,10 +944,14 @@ int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
> break;
>
> /* not already used */
> - if (test_bit(hwc->idx, used_mask))
> + if (test_bit(hwc->idx, used_mask) || (is_large_inc(hwc) &&
> + test_bit(hwc->idx + 1, used_mask)))
> break;
>
> __set_bit(hwc->idx, used_mask);
> + if (is_large_inc(hwc))
> + __set_bit(hwc->idx + 1, used_mask);
> +
> if (assign)
> assign[i] = hwc->idx;
> }

This is just really sad... I fixed that too.

Can you verify the patches in:

git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git perf/amd

work?