Re: [PATCH] perf/core: generate overflow signal when samples are dropped (WAS: Re: [REGRESSION] perf/core: PMU interrupts dropped if we entered the kernel in the "skid" region)

From: Ingo Molnar
Date: Tue Jul 04 2017 - 06:04:30 EST

* Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> On Thu, Jun 29, 2017 at 10:12:33AM +0200, Ingo Molnar wrote:
> >
> > * Mark Rutland <mark.rutland@xxxxxxx> wrote:
> >
> > > It still seems wrong to make up data, though.
> >
> > So what we have here is a hardware quirk: we asked for user-space samples, but
> > didn't get them and we cannot expose the kernel-internal address.
> >
> > The question is, how do we handle the hardware quirk. Since we cannot fix the
> > hardware on existing systems there's really just two choices:
> >
> > - Lose the sample (and signal it as a lost sample)
> >
> > - Keep the sample but change the sensitive kernel-internal address to something
> >   that is not sensitive: 0 or -1 works, but we could perhaps also return a
> >   well-known user-space address such as the vDSO syscall trampoline or such?
> >
> > there's no other option really.
> >
> > I'd lean towards Vince's take: losing samples is more surprising than getting the
> > occasional sample with some sanitized data in it.
> >
> > If we make the artificial data still a meaningful user-space address, related to
> > kernel entries, then it might even be a bonus, as users would learn to recognize
> > it as: 'oh, skid artifact, I know about that'.
>
> So while we could easily fake SAMPLE_IP to do as you suggest, other
> entries might be much harder to fake. That said, I have no problems with
> just 0 stuffing them.
>
> The only real problem is determining how much to stuff I suppose.

I think the RIP is the most important one to fix up in an informative fashion
(instead of just zeroing it out), so that mainstream users of 'perf top' or
'perf report' have a chance to see that certain entries have this skid artifact.

The other registers should be zeroed out once we stop trusting a sample.
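
Roughly, the policy could be sketched like this. This is a stand-alone
user-space illustration only, not the actual perf core code: the struct,
the function name, the register count and the marker address are all
hypothetical, and the real fix would have to pick a meaningful well-known
user-space address for the marker.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical values, chosen only for this illustration. */
#define SKETCH_NR_REGS		4
#define SKETCH_SKID_MARKER_IP	0x1000ULL

/* Hypothetical, simplified stand-in for a recorded sample. */
struct sketch_sample {
	uint64_t ip;
	uint64_t regs[SKETCH_NR_REGS];
};

/*
 * If the event asked for user-space samples only, but the PMU interrupt
 * landed in kernel mode (skid), stop trusting the sample:
 *  - replace the IP with a recognizable, non-sensitive marker address
 *  - zero out all other sampled registers
 *
 * Returns true if the sample was sanitized, false if it was left as-is.
 */
static bool sketch_sanitize_sample(struct sketch_sample *s,
				   bool exclude_kernel, bool in_kernel,
				   uint64_t marker_ip)
{
	if (!exclude_kernel || !in_kernel)
		return false;

	s->ip = marker_ip;
	memset(s->regs, 0, sizeof(s->regs));

	return true;
}
```

The point of passing in a marker instead of hard-coding 0 is that tooling
can then group all such samples under one recognizable entry.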