Re: [PATCH v1 00/13] perf/x86/amd: Add AMD Fam19h Branch Sampling support

From: Peter Zijlstra
Date: Wed Sep 15 2021 - 05:05:03 EST


On Tue, Sep 14, 2021 at 10:55:12PM -0700, Stephane Eranian wrote:
> On Thu, Sep 9, 2021 at 1:55 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >
> > On Thu, Sep 09, 2021 at 12:56:47AM -0700, Stephane Eranian wrote:
> > > This patch series adds support for the AMD Fam19h 16-deep branch sampling
> > > feature as described in the AMD PPR Fam19h Model 01h Revision B1 section 2.1.13.
> >
> > Yay..
> >
> > > BRS interacts with the NMI interrupt as well. Because enabling BRS is expensive,
> > > it is only activated after P event occurrences, where P is the desired sampling period.
> > > At P occurrences of the event, the counter overflows, the CPU catches the NMI interrupt,
> > > activates BRS for 16 branches until it saturates, and then delivers the NMI to the kernel.
> >
> > WTF... ?!? Srsly? You're joking right?
> >
>
> As I said, this is because of the cost of running BRS usually for
> millions of branches to keep only the last 16.
> Running branch sampling in general on any arch is never totally free.

Holding up the NMI will disrupt the sampling of the other events, which
is, IMO, unacceptable and would require this event to be exclusive on the
whole PMU, simply because sharing it doesn't work.

(also, other NMI sources might object)

Also, by only having LBRs post-overflow you can't apply LBR-based
analysis to other events, which seems quite limiting.

This really seems like a very sub-optimal solution. I mean, it's awesome
AMD gets branch records, but this seems a very poor solution.