Re: Basic perf PMU support for Haswell v12

From: Ingo Molnar
Date: Thu May 30 2013 - 03:02:03 EST



* Ingo Molnar <mingo@xxxxxxxxxx> wrote:

> * Andi Kleen <andi@xxxxxxxxxxxxxx> wrote:
>
> > On Tue, May 28, 2013 at 08:29:15AM +0200, Ingo Molnar wrote:
> > >
> > > * Andi Kleen <andi@xxxxxxxxxxxxxx> wrote:
> > >
> > > > All outstanding issues fixed I hope. And I added mem-loads/stores support.
> > > >
> > > > Contains support for:
> > > > - Basic Haswell PMU and PEBS support
> > > > - Late unmasking of the PMI
> > > > - mem-loads/stores support
> > > >
> > > > v2: Addressed Stephane's feedback. See individual patches for details.
> > > > v3: now even more bite-sized. Qualifier constraints merged earlier.
> > > > v4: Rename some variables, add some comments and other minor changes.
> > > > Add some Reviewed/Tested-bys.
> > > > v5: Address some minor review feedback. Port to latest perf/core
> > > > v6: Add just some variable names, add comments, edit descriptions, some
> > > > more testing, rebased to latest perf/core
> > > > v7: Expand comment
> > > > v8: Rename structure field.
> > > > v9: No wide counters, but add basic LBRs. Add some more
> > > > constraints. Rebase to 3.9rc1
> > > > v10: Change some whitespace. Rebase to 3.9rc3
> > > > v11: Rebase to perf/core. Fix extra regs. Rename INTX.
> > > > v12: Rebase to 3.10-rc2
> > > > Add mem-loads/stores support for parity with Sandy Bridge.
> > > > Fix fixed counters (Thanks Ingo!)
> > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > Make late ack optional
> > > > Export new config bits in sysfs.
> > > > Minor changes
> > >
> > > I reported a pretty nasty regression with the previous version (v10)
> > > of this series: it broke the default 'perf top' on non-Haswell systems.
> > > It's unclear from this changelog to what extent you managed to
> > > reproduce and fix the bug, and what the fix was?
> >
> > Thanks for checking.
> > I didn't reproduce it, but by code review I found a problem with the
> > fixed counter constraints.
> >
> > I think I fixed it by adding this hunk:
> >
> > @@ -2227,7 +2313,7 @@ __init int intel_pmu_init(void)
> > * counter, so do not extend mask to generic counters
> > */
> > for_each_event_constraint(c, x86_pmu.event_constraints) {
> > - if (c->cmask != X86_RAW_EVENT_MASK
> > + if (c->cmask != FIXED_EVENT_FLAGS
> > || c->idxmsk64 == INTEL_PMC_MSK_FIXED_REF_CYCLES) {
> > continue;
> > }
> >
> > It would be cleaner to detect the fixed counters in some other way,
> > but that was the simplest fix I could find.
> >
> > Testing appreciated
>
> Fair enough - I'll give it a whirl - that hunk does indeed look like it
> could make a difference.

Ok, I can confirm that this fixed the perf top and perf record regression
I saw on Intel systems.

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/