Re: [PATCH 1/2] perf/core: Adding capability to disable PMUs event multiplexing

From: Ganapatrao Kulkarni
Date: Thu Nov 07 2019 - 10:45:23 EST


On Thu, Nov 7, 2019 at 6:52 AM Mark Rutland <mark.rutland@xxxxxxx> wrote:
>
> On Wed, Nov 06, 2019 at 03:28:46PM -0800, Ganapatrao Kulkarni wrote:
> > Hi Peter, Mark,
> >
> > On Wed, Nov 6, 2019 at 3:28 AM Mark Rutland <mark.rutland@xxxxxxx> wrote:
> > >
> > > On Wed, Nov 06, 2019 at 01:01:40AM +0000, Ganapatrao Prabhakerrao Kulkarni wrote:
> > > > When PMUs are registered, perf core enables event multiplexing
> > > > support by default. There is no provision for PMUs to disable
> > > > event multiplexing, if PMUs want to disable due to unavoidable
> > > > circumstances like hardware errata etc.
> > > >
> > > > Adding PMU capability flag PERF_PMU_CAP_NO_MUX_EVENTS and support
> > > > to allow PMUs to explicitly disable event multiplexing.
> > >
> > > Even without multiplexing, this PMU activity can happen when switching
> > > tasks, or when creating/destroying events, so as-is I don't think this
> > > makes much sense.
> > >
> > > If there's an erratum whereby heavy access to the PMU can lock up the
> > > core, and it's possible to work around that by minimizing accesses,
> > > that should be done in the back-end PMU driver.
> >
> > As said in the erratum, if there is heavy access to memory (such as a
> > stream application running) and, along with that, the PMU control
> > registers are also accessed frequently, then a CPU lockup is seen.
>
> Ok. So the issue is the frequency of access to those registers.
>
> Which registers does that apply to?

The control registers which are used to start and stop the counters, and
the register which is used to set the event type.
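
To illustrate (a simplified sketch only, with made-up register names and
offsets, not the actual thunderx2 uncore driver code), every pmu->add/
->start and pmu->del/->stop ends up doing MMIO writes along these lines:

#include <linux/io.h>
#include <linux/perf_event.h>

/* Hypothetical register layout, for illustration only. */
#define REG_EVENT_TYPE(i)       (0x100 + (i) * 8)
#define REG_COUNTER_CTL(i)      (0x200 + (i) * 8)
#define CTL_ENABLE              0x1

struct example_pmu {
        struct pmu      pmu;
        void __iomem    *base;
};

#define to_example_pmu(p)       container_of((p), struct example_pmu, pmu)

static void example_uncore_start(struct perf_event *event, int flags)
{
        struct example_pmu *xp = to_example_pmu(event->pmu);
        int idx = event->hw.idx;

        /* One MMIO write to program the event type... */
        writel((u32)event->hw.config, xp->base + REG_EVENT_TYPE(idx));
        /* ...and another to enable the counter. */
        writel(CTL_ENABLE, xp->base + REG_COUNTER_CTL(idx));
}

static void example_uncore_stop(struct perf_event *event, int flags)
{
        struct example_pmu *xp = to_example_pmu(event->pmu);

        /* Stopping the counter is yet another control-register write. */
        writel(0, xp->base + REG_COUNTER_CTL(event->hw.idx));
}

With multiplexing, every rotation stops and restarts the active counters,
so the add/del counts quoted below translate directly into bursts of such
control-register writes while the stream workload keeps the memory
subsystem saturated.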
>
> Is this the case for only reads, only writes, or both?

It is a write issue: the h/w block has limited write buffers, and when
memory transactions are high these overflow, leaving the hardware in a
weird state.

>
> Does the frequency of access actually matter, or is it just more likely
> that we see the issue with a greater number of accesses? i.e. the
> increased frequency increases the probability of hitting the issue.

This is one scenario.
Any high rate of access to the PMU registers while memory is busy needs to
be controlled.

>
> I'd really like a better description of the HW issue here.
>
> > I ran perf stat with 4 events of the thunderx2 PMU as well as with 6
> > events, with the stream application running.
> > For the 4-event run there is no event multiplexing, whereas for the
> > 6-event run the events are multiplexed.
> >
> > For 4 event run:
> > No of times pmu->add is called: 10
> > No of times pmu->del is called: 10
> > No of times pmu->read is called: 310
> >
> > For 6 events run:
> > No of times pmu->add is called: 5216
> > No of times pmu->del is called: 5216
> > No of times pmu->read is called: 5216
> >
> > The issue happens when add and del are called too many times, as seen
> > with the 6-event case.
>
> Sure, but I can achieve similar by creating/destroying events in a loop.
> Multiplexing is _one_ way to cause this behaviour, but it's not the
> _only_ way.

I agree, there may be other use cases as well; however, I am trying to fix
the common use case.
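
For reference, the rough shape of what is being proposed looks like the
below (a simplified sketch reusing the hypothetical example_uncore names
from above, not the exact diff; the core-side hook is shown in the
rotation path purely for illustration):

/* Back-end driver side, at PMU registration time: */
        xp->pmu.capabilities |= PERF_PMU_CAP_NO_MUX_EVENTS;
        ret = perf_pmu_register(&xp->pmu, "example_uncore", -1);

/* perf core side, before rotating (multiplexing) events on a context: */
        if (cpuctx->ctx.pmu->capabilities & PERF_PMU_CAP_NO_MUX_EVENTS)
                return false;   /* do not multiplex events on this PMU */

PMUs that do not set the flag keep the current behaviour; only a PMU that
cannot tolerate the extra control-register traffic would opt out.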

>
> > The PMU hardware control registers are programmed when the add and del
> > functions are called.
> > For pmu->read there are no issues, since there is no h/w issue with the
> > data path.
>
> As above, can you please describe the hardware conditions more
> thoroughly?
>
> > This is an UNCORE driver; I am not sure whether context switch has any
> > influence on this?
>
> I believe that today it's possible for this to happen for cgroup events,
> as non-sensical as it may be to have a cgroup-bound uncore PMU event.
>
> > Please suggest how we can fix this in the back-end PMU driver without
> > any perf core help?
>
> In order to do so, I need a better explanation of the underlying
> hardware issue.

I will try to add more explanation to the erratum; meanwhile, please let me
know if you have any specific questions.

>
> Thanks,
> Mark.

Thanks,
Ganapat