Re: [RESEND PATCH V3 3/8] perf/x86/intel: Support hardware TopDown metrics

From: Peter Zijlstra
Date: Thu Aug 29 2019 - 09:53:11 EST


On Thu, Aug 29, 2019 at 09:31:37AM -0400, Liang, Kan wrote:
> On 8/28/2019 11:19 AM, Peter Zijlstra wrote:
> > > +static int icl_set_topdown_event_period(struct perf_event *event)
> > > +{
> > > + struct hw_perf_event *hwc = &event->hw;
> > > + s64 left = local64_read(&hwc->period_left);
> > > +
> > > + /*
> > > + * Clear PERF_METRICS and Fixed counter 3 in initialization.
> > > + * After that, both MSRs will be cleared for each read.
> > > + * Don't need to clear them again.
> > > + */
> > > + if (left == x86_pmu.max_period) {
> > > + wrmsrl(MSR_CORE_PERF_FIXED_CTR3, 0);
> > > + wrmsrl(MSR_PERF_METRICS, 0);
> > > + local64_set(&hwc->period_left, 0);
> > > + }
> > This really doesn't make sense to me; if you set FIXED_CTR3 := 0, you'll
> > never trigger the overflow there; this then seems to suggest the actual
> > counter value is irrelevant. Therefore you don't actually need this.
> >
>
> Could you please elaborate on why initialization to 0 never triggers an
> overflow?

Well, 'never' as in a 'long' time.

> As of my understanding, initialization to 0 only means that it will take
> more time than initialization to -max_period (0x8000 0000 0001) to trigger
> an overflow.

Only twice as long. And why do we care about that?

The problem with it is though that you get the overflow at the end of
the whole period, instead of halfway through, so reconstruction is
trickier.

> Maybe 0 is too tricky. We can set the initial value to 1.

That's even worse. I'm still not understanding why we can't use the
normal code.

> I think the bottom line is that we need a small initial value for FIXED_CTR3
> here.

But why?!

> PERF_METRICS reports an 8bit integer fraction which is something like 0xff *
> internal counters / FIXCTR3.
> The internal counters only start counting from 0. (SW cannot set an
> arbitrary initial value for internal counters.)
> If the initial value of FIXED_CTR3 is too big, PERF_METRICS could always
> remain constant, e.g. 0.

What what? The PERF_METRICS contents depends on the FIXCTR3 value ?!
That's bloody insane. /me goes find the SDM. The SDM is bloody useless
:-(.

Please give a complete and coherent description of all of this. I can't
very well review any of this until I know how the hardware works, now
can I.

In this write-up, include the exact condition for METRICS_OVF (the SDM
states: 'it indicates that PERF_METRIC counter has overflowed', which is
gramatically incorrect and makes no sense even with the missing article
injected).