Re: [PATCH 3/3] perf, x86: Add INST_RETIRED.ALL workarounds

From: Peter Zijlstra
Date: Mon Mar 23 2015 - 06:19:50 EST


On Mon, Mar 23, 2015 at 10:38:54AM +0100, Ingo Molnar wrote:
>
> * Andi Kleen <andi@xxxxxxxxxxxxxx> wrote:
>
> > From: Andi Kleen <ak@xxxxxxxxxxxxxxx>
> >
> > On Broadwell INST_RETIRED.ALL cannot be used with any period
> > that doesn't have the lowest 6 bits cleared. And the period
> > should not be smaller than 128.
>
> Sloppy changelog: a most basic question is not answered by the
> changelog: what happens in practice when the period is set to a
> smaller value than 128?

http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/5th-gen-core-family-spec-update.pdf

BDM11 and BDM55 (not 57) tell us that the PMU will generate crap output
if you don't do this. Non-fatal but gibberish.

> > +/*
> > + * Broadwell:
> > + * The INST_RETIRED.ALL period always needs to have lowest
> > + * 6bits cleared (BDM57). It shall not use a period smaller
> > + * than 100 (BDM11). We combine the two to enforce
> > + * a min-period of 128.
> > + */
>
> Sloppy comment: that's not what we do:
>
> > +static unsigned bdw_limit_period(struct perf_event *event, unsigned left)
> > +{
> > + if ((event->hw.config & INTEL_ARCH_EVENT_MASK) ==
> > + X86_CONFIG(.event=0xc0, .umask=0x01)) {
> > + if (left < 128)
> > + left = 128;
> > + left &= ~0x3fu;
> > + }
> > + return left;
>
> We enforce a minimum period of 128 and round the requested period to
> 64.

Not quite, we enforce a min period of 128 but otherwise mask bits 0-5;
that rounds down to a multiple of 64, never up.

Which is pretty much what the comment says.
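
To make the clamp-then-mask arithmetic concrete, here is a standalone
userspace sketch of the same two steps. It is not the kernel function;
limit_period_sketch() is a made-up name and the event-match check from
the patch is left out.

```c
#include <stdint.h>

/*
 * Userspace sketch of the arithmetic in the quoted bdw_limit_period():
 * clamp to a minimum of 128, then clear bits 0-5.
 * The name is hypothetical; the event check from the patch is omitted.
 */
static unsigned int limit_period_sketch(unsigned int left)
{
	if (left < 128)
		left = 128;		/* enforce the minimum */
	return left & ~0x3fu;		/* round down to a multiple of 64 */
}
```

So a value of 100 comes out as 128, 191 also comes out as 128, and 1000
comes out as 960.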

> I think in this case it would be useful to tooling if we updated the
> syscall attribute with the real period value that was used, to not
> skew tooling output.

We already have a fuzz of up to sample_period events: we don't know how
far into the last period we are when we stop the event, and it might
have been 1 event away from generating a PMI. This patch doesn't
actually add significantly to that.

Also, the effective period is the one specified. If the requested period
is < 128 we simply reject the event creation. If it's any larger we
iterate around the requested sample period with steps of 64, but such
that we average out on the requested period. There is no 'real' period
to copy back.

Another way to look at this is that we use a form of pulse density
modulation to create the desired period using the larger step size; or
perhaps compare it to Bresenham's line drawing algorithm.
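
The averaging argument can be checked with a small userspace model. This
is a simplified sketch under my own assumptions, not the kernel's actual
bookkeeping: limit_left() and events_for_samples() are hypothetical
names, the requested period is assumed to be >= 128, and a sample is
assumed to be emitted only once the carried remainder drops to zero or
below.

```c
#include <stdint.h>

/* Hypothetical helper: clamp to >= 128, then clear bits 0-5. */
static int64_t limit_left(int64_t left)
{
	if (left < 128)
		left = 128;
	return left & ~(int64_t)0x3f;
}

/*
 * Simplified model (assumption, not kernel code): return how many
 * events elapse before n_samples samples have been emitted when the
 * requested period is 'period' (assumed >= 128).
 */
static int64_t events_for_samples(int64_t period, int n_samples)
{
	int64_t left = period;	/* events still owed before the next sample */
	int64_t total = 0;	/* events counted so far */
	int emitted = 0;

	while (emitted < n_samples) {
		int64_t programmed = limit_left(left);

		total += programmed;
		left -= programmed;	/* carry the error forward */
		if (left <= 0) {
			emitted++;
			left += period;	/* start the next period */
		}
	}
	return total;
}
```

In this model the error is carried from one period to the next rather
than dropped, so the total stays within 127 events of
n_samples * period however many samples are taken; individual intervals
wobble in steps of 64 but the average converges on the requested period.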


